Sunday, 15 October 2017

Yet another MyHeritage DNA mystery

This one is another in the category of "what you can't see you DO miss", ie another example of a close match reported by, in this instance, FamilyTreeDNA and GEDMatch, but not shown on MyHeritage.

Original tests both done on FamilyTreeDNA (May 2010 and Oct 2016), both uploaded to GEDMatch and both transferred to MyHeritage.

Actual relationship 2nd cousins.

Predicted relationship: 2nd to 3rd cousin, sharing a total of 144cMs, largest segment 19cMs.
Looking at only those segments over 7cMs, this reduces to a total of 96cMs over 7 segments (including a 10cMs X chromosome match):

Chromosome  Start Location  End Location  centiMorgans (cM)  # of Matching SNPs
1           90373522        107499183     14.1               4500
2           50979064        71469042      18.47              5500
3           173752942       179284891     8.02               1300
5           171792682       180625733     18.64              2613
17          74799181        78639702      8.85               1023
18          4458642         11750915      17.58              2693
X           131769934       139569333     10.25              925
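
The 96cMs figure can be checked by adding up the rows of the table. A minimal Python sketch follows, with the rows typed in by hand from the table above. The names SEGMENTS and summarise are mine for illustration and are not anything FamilyTreeDNA provides.

```python
# Segments copied by hand from the chromosome table above:
# (chromosome, start, end, cM, matching SNPs)
SEGMENTS = [
    ("1", 90373522, 107499183, 14.1, 4500),
    ("2", 50979064, 71469042, 18.47, 5500),
    ("3", 173752942, 179284891, 8.02, 1300),
    ("5", 171792682, 180625733, 18.64, 2613),
    ("17", 74799181, 78639702, 8.85, 1023),
    ("18", 4458642, 11750915, 17.58, 2693),
    ("X", 131769934, 139569333, 10.25, 925),
]


def summarise(segments, min_cm=7.0):
    """Return (count, total cM, largest cM) for segments of at least min_cm."""
    kept = [cm for (_chrom, _start, _end, cm, _snps) in segments if cm >= min_cm]
    return len(kept), round(sum(kept, 0.0), 2), max(kept, default=0.0)


count, total, largest = summarise(SEGMENTS)
print(f"{count} segments, {total} cM total, largest {largest} cM")
```

The seven rows come to just under 96cMs, with the chromosome 5 segment the largest, which agrees with the rounded figures quoted above.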

GEDMatch One to One comparison at the default settings (500 SNPs minimum, 7cM minimum segment):
Comparing Kit A and Kit B
Chr  Start Location  End Location  Centimorgans (cM)  SNPs
Largest segment = 28.1 cM
Total of segments > 7 cM = 111.1 cM
7 matching segments
Estimated number of generations to MRCA = 3.5

672062 SNPs used for this comparison.

This maternal cousin does show up on the MyHeritage match lists for my two maternal aunts, both sharing 5.5%, 395cMs total, largest 63cMs, one with 14 segments, the other with 18.
But she is nowhere to be found on my match list, nor I on hers.

The FTDNA Chromosome Browser picture for my 2nd cousin compared to myself and the maternal aunts clearly shows at least three segments (chr. 1, 2, 5) where I match at least one of the aunts. At those spots the matches total about 52cMs, well above the level at which I'd expect MyHeritage to report a match.

This has been logged with MyHeritage Support for their scientists to take a look.

Wednesday, 20 September 2017

Plea for MyHeritage DNA

Oh I do so wish MyHeritage would hasten along their promised Chromosome Browser so that where a match actually matches can be easily seen without a plea for them to upload to GEDmatch - although the latter upload is highly recommended regardless *.

* bewitched bothered and bewildered by GEDMatch?
Check out Jim Bartlett's latest post on his segmentology blog: Getting started with GEDMatch

My second wish is that such a tool would also differentiate between
- the components of the match that are real, from comparing apples with apples, and
- those bits that are imputed by MyHeritage in order to bring the set of SNPs being compared up to the full set
which latter comparison is apples with bananas, the last word being chosen deliberately.

My assumption is that this latter imputation is responsible for the example below (logged with MyHeritage for an explanation, and the two pleas above.)

An excited match contacted me about one of my kits saying "I'm a 1st cousin 2*removed can I see your tree please" adding later that "the connection should be easy to find".

Leaving aside that the tree was connected and visible, and that the match is actually listed under the heading MyHeritage labels as:
"DNA Matches listed below are lower quality matches that should be viewed with skepticism"
what MyHeritage reported was 

ie a total of 31.8cMs over 3 segments, largest segment 15.6cM

What GEDmatch reported at the default settings for the two Ancestry kits, both presumed to be recent, ie post their chip change of May/Jun 2016:

Comparing Kit one and Kit two
Minimum threshold size to be included in total = 500 SNPs
Mismatch-bunching Limit = 250 SNPs
Minimum segment cM to be included in total = 7.0 cM

Largest segment = 0.0 cM
Total of segments > 7 cM = 0.0 cM
(2228) No shared DNA segments found

441115 SNPs used for this comparison.

Lowering the minimum segment cM to 5cM showed two small segments:
Chr Start Location End Location Centimorgans (cM) SNPs
1 120,064,541 151,437,807 5.5 684
9 127,173,028 131,032,682 5.1 524
Largest segment = 5.5 cM
Total of segments > 4 cM = 10.6 cM
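
The effect of the two thresholds can be illustrated with a small filter. This is only a sketch in Python: it applies the minimum cM and minimum SNP cut-offs to segments that have already been found, and does not attempt to reproduce how GEDmatch detects segments (the mismatch-bunching setting, for instance, is ignored). The function name one_to_one_total is my own.

```python
# The two segments reported once the minimum was lowered to 5 cM:
# (chromosome, start, end, cM, SNPs)
SEGMENTS = [
    ("1", 120_064_541, 151_437_807, 5.5, 684),
    ("9", 127_173_028, 131_032_682, 5.1, 524),
]


def one_to_one_total(segments, min_cm=7.0, min_snps=500):
    """Return (total cM, largest cM) of segments meeting both thresholds."""
    kept = [
        cm
        for (_chrom, _start, _end, cm, snps) in segments
        if cm >= min_cm and snps >= min_snps
    ]
    return round(sum(kept, 0.0), 1), max(kept, default=0.0)


print(one_to_one_total(SEGMENTS))              # defaults: 7 cM, 500 SNPs
print(one_to_one_total(SEGMENTS, min_cm=5.0))  # minimum lowered to 5 cM
```

At the defaults neither segment qualifies, so the total is zero. With the minimum lowered to 5cM both qualify, giving the 10.6cMs total and 5.5cM largest segment shown above.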

We have not yet found this match on Ancestry, and do not expect to.

Wednesday, 30 August 2017

Assorted DNASurnames updates

Several updates have occurred to the DNASurnames Haplogroup tree that shows key branches, the patriarch, and optionally, the tester, for surnames/lineages of interest to assorted projects, sometimes sub-projects of much larger ones.

The haplogroup R tree now includes in particular:
- BigY driven changes around R-S7361, including the new branch created for my Henderson family
- ditto around R-ZZ7_1 (well R-Y16467*) for the Richardsons of Morebattle subset of the Richardson-2 project
- a place holder at R-M269 for the Dawe family of Lamerton, Devon while we await BigY test results for the line

On the lineages side, the RICHARDSON pages in particular have had a revamp to include a theoretical DNA signature chart linking up the 4 yDNA STR matching lines to their parent terminal SNPs to see what the derived ySTR signature for the progenitor would be.

Another MyHeritage mystery

Previous posts have documented some of the vagaries experienced with MyHeritage's DNA matching.

This one makes me wonder how those without an ability to compare at other companies would even realise that they may be missing a close match!

I expect that some matches may well be false positives given the imputations MyHeritage does to compare test kits from assorted companies, but I didn't expect to NOT find a match reported as predicted 2nd cousin by Ancestry, 3 generations by GEDmatch and predicted 2nd to 3rd cousin by FamilyTreeDNA.

The comparison details:

Ancestry: both tested there
Predicted relationship: 2nd Cousins
Possible range: 2nd - 3rd cousins
Confidence: Extremely High
Total shared: 208 cMs over 11 segments

FamilyTreeDNA: one tested, the other transferred in from recent-ish Ancestry test
Predicted 2nd to 3rd cousins total 202 * cMs, largest segment 57
* effectively 175cMs when < 5cM segments excluded

GEDMatch: one uploaded from FTDNA, the other from the recent-ish Ancestry test #
At the defaults for one to one comparison:
Largest segment = 56.2 cM
Total of segments > 7 cM = 212.5 cM
11 matching segments
Estimated number of generations to MRCA = 3.0

# these kits being the same used for the MyHeritage uploads.

This has been reported to MyHeritage Support for their comments and explanation.

Monday, 7 August 2017

Visualise your Ancestry DNA Matches

Drowning in the analysis of your Ancestry match lists?
Can't figure out who best to contact?
Or how to get their attention by targeting your message?

Check out how to Visualise your Ancestry DNA Matches
Shelley Crawford has written several very clear posts about how to use  NodeXL (and's Ancestry download tools) to make sense of the morass of data.
Thank you to whoever mentioned this on the Facebook group Genetic Genealogy Tips & Techniques.

The instructions enabled me to reasonably quickly get from my seemingly endless screeds of data to be filtered and diced and sliced to:
Firstly this:

and then this filtered set of groups that actually do have interactions - or have a match of particular interest in them:

It does also rather highlight that I have a dearth of interconnected Ancestry matches up to "4th cousins" - only 147.
Of that 147 there's one 1st cousin (not included in the networks above to reduce clutter), one 2nd cousin, 6 predicted 3rd cousins (5 known to range from 2nd to 3C1R, the 6th unknown), and the remainder (139) predicted 4th, only 23 of whom are "placed".
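
The idea of keeping only the groups that actually interact can be sketched without NodeXL. The Python below is an illustration only, not what NodeXL does internally: the match names and shared-match pairs are made up, and the grouping is a plain connected-components walk over them.

```python
from collections import defaultdict

# Hypothetical data, for illustration only: six matches, and the pairs
# that appear on each other's shared-match lists.
MATCHES = ["A", "B", "C", "D", "E", "F"]
SHARED = [("A", "B"), ("B", "C"), ("D", "E")]


def clusters(matches, pairs):
    """Group matches into connected clusters; isolated matches stand alone."""
    graph = defaultdict(set)
    for name in matches:
        graph[name]  # make sure every match has an entry, even if isolated
    for left, right in pairs:
        graph[left].add(right)
        graph[right].add(left)

    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Keep only the groups with real interactions (more than one person).
interacting = [g for g in clusters(MATCHES, SHARED) if len(g) > 1]
print(interacting)
```

With this made-up data, match F shares with nobody and drops out, leaving one group of three and one group of two.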

Within an hour of posting my first targeted message to a 29cMs match asking if they had any idea how the three people connected, I had a response, with a Perthshire connection identified between them. So possibly that Henderson/Millar brickwall will crack one day after all, once I figure out where it can be fitted into my tree!

Thursday, 18 May 2017

Benefits of BigY and yFull

The Fairbairn surname DNA project has reaped the benefit of testing a group for BigY and having the results analysed at yFull.


A new branch, I-Y32666, with an estimated TMRCA of 225 ybp, has been created under I-Y7277 on the I1 haplogroup tree.

Now all we have to do is convince a few more of the matching Fairbairn kits, or an Elliot or two, to similarly test, to see how far back from that the Fairbairns and their matching Elliots (a very small subset of that surname project) diverged, or were still together for that matter!

Wednesday, 3 May 2017

MyHeritage result compared to Ancestry/FTDNA/GedMatch

A "cousin" (3C2R) was selected by MyHeritage to be one of their initial DNA tested population.

His match has just come through (kit had to be re-sent), so I thought it a good opportunity to compare results across companies given he (SB) had tested at MyHeritage and Ancestry, as well as transferring his Ancestry test to FamilyTreeDNA and MyHeritage.
I've tested at Ancestry and FTDNA, and transferred the latter to MyHeritage.

We are both on GedMatch: T087062 (my Ancestry kit on GedMatch is set to Research only to avoid duplicating matches and cluttering result lists) and A533474.

The moral of the story appears to be:
Fish in as many ponds as you can afford, and make sure you upload to GedMatch to get a consistent view across companies as the data certainly varies, particularly in the total shared DNA.

The largest segment shows as a more consistent value across different companies than the total DNA shared, which is way too variable to be a reliable indicator from any one company.

The largest segment (excluding Ancestry, which doesn't report such details - grrr!) ranges from 51.3 (GedMatch phased to his father) to 56.3 (MyHeritage), with FTDNA coming in at 52, so all within cooee.

The biggest differences come in on the total shared DNA, ranging from 44 to 104 cMs!
MyHeritage wins at a massive 104.8cM, presumably because my transferred-in kit has DNA imputed from their database: about 40cMs when compared with GedMatch's 60cMs total shared over 7cMs, or about 20cMs when compared with FTDNA's 89 total.
Ancestry only shows 44cMs total shared.
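
The gaps between the companies are simple subtraction, sketched here in Python using only the totals quoted in this post. Putting the excess down to imputation is my presumption, as stated above, and the code does nothing more than the arithmetic. The names TOTALS and excess_over are just for illustration.

```python
# Total shared cM reported for the same two people, as quoted in this post.
TOTALS = {
    "MyHeritage": 104.8,
    "FTDNA": 89.0,
    "GedMatch": 60.1,  # segments over 7 cM, default settings
    "Ancestry": 44.0,
}


def excess_over(totals, company, baseline):
    """How many more cM one company reports than another."""
    return round(totals[company] - totals[baseline], 1)


for baseline in ("GedMatch", "FTDNA"):
    print(f"MyHeritage over {baseline}: {excess_over(TOTALS, 'MyHeritage', baseline)} cM")

spread = round(max(TOTALS.values()) - min(TOTALS.values()), 1)
print(f"Spread across all four: {spread} cM")
```

That puts MyHeritage 44.7cM above GedMatch and 15.8cM above FTDNA, with 60.8cM between the highest and lowest totals.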

On GedMatch, even reducing the match parameters to 1cM and 300 SNPs, the total shared was still only 66cMs compared to FTDNA's 89. It did increase the segments shared to 4, with one under 5cMs now being shown, compared with Ancestry's 3 shared segments.

Intriguingly, however, his initial kit transferred to MyHeritage from Ancestry (v2) shows a grand total of 6 matches, and does not include me, nor the other two cousins we share who are on MyHeritage. All of us do show up in his actual MyHeritage test. One of those cousins does not show up for my transferred-in kit, but does on GedMatch and FTDNA (53 total/20 cMs largest).

Predicted relationship comparisons:
MyHeritage: 1C2R - 3C1R
Ancestry: 4th cousins - range 4-6th cousins
FTDNA: 2nd to 4th cousins
Actual: 3C2R

The details:

His match to myself (mine a transfer in from FamilyTreeDNA, his a direct test)

MyHeritage result:
Estimated relationships: 1st cousin twice removed - 3rd cousin once removed
DNA Match quality
Shared DNA: 1.5% (104.8 cM)
Shared segments
Largest segment: 56.3 cM
(no changes after the apparent recalculations being done around 4 May 2017)

Ancestry result (both tested here, me v1, SB v2):
Predicted relationship: 4th Cousins
Possible range: 4th - 6th cousins
Confidence: High
where the info button showed a total DNA shared of 44cMs over 3 segments

FamilyTreeDNA result (his a transfer from Ancestry v2):
Relationship range: 2nd Cousin - 4th Cousin; total shared: 89 cMs; largest segment: 52 cMs; X-Match

where there are 2 segments over 5cMs, 4 segments counting all over 3cMs and 16 segments when counted over 1cMs, including the X.
The X match consisted of two segments, both less than 3cMs (and cannot be on the line we are known to share).

GedMatch result (both from Ancestry tests, me v1, SB v2; at defaults from the one to one comparison):

Chr  Start Location  End Location  Centimorgans (cM)  SNPs
Largest segment = 52.2 cM
Total of segments > 7 cM = 60.1 cM
2 matching segments
Estimated number of generations to MRCA = 3.9

403006 SNPs used for this comparison.

GedMatch with his match phased against his father's (at default settings):

Chr  Start Location  End Location  Centimorgans (cM)  SNPs
Largest segment = 51.3 cM
Total of segments > 7 cM = 59.2 cM
2 matching segments
Estimated number of generations to MRCA = 4.0

402849 SNPs used for this comparison.

Reducing the comparison to 300 SNPs and 1cM:

Chr  Start Location  End Location  Centimorgans (cM)  SNPs
Largest segment = 51.3 cM
Total of segments > 1 cM = 66.7 cM
4 matching segments

402849 SNPs used for this comparison.