Peer Reviews - Sasquatch Genome Study

These are the leaked peer reviews of Dr. Ketchum's DNA study, "Novel North American Hominins, Next Generation Sequencing of Three Whole Genomes and Associated Studies". Two sets of reviews were leaked from the journal Nature and one set from the "Journal of Advanced Zoological Exploration in Zoology" (JAMEZ), which gave the Ketchum DNA study passing reviews and was planning to publish it in its inaugural edition on or about January 11th, 2013.

Dr. Ketchum's statement via Facebook verifying the authenticity of the peer reviews:

As far as all of the documents that were leaked, they are authentic. I didn't leak them though. Some submitters as well as others associated with the project had access to this information so I don't know who did. Those communications are not supposed to be leaked. So if anyone gets sued, it won't be me. They can subpoena the blogger's email trail. I actually am happy about the leak though. At least now everything I have said can be authenticated including the ridiculous and biased nature of the reviews. Now people will know the truth and that we did pass peer review.
JAMEZ Peer Review with Dr. Ketchum's Responses:

Authors’ Response to Review

Referee A
I appreciate having had the opportunity to review this interesting investigation into such a controversial subject and recognize the enormous constraints under which the authors and their collaborators have labored.
Major concerns with this manuscript:
1. A difficult two-part thesis is posed, and both parts are inadequately substantiated by the analysis presented in the manuscript (i.e. Part 1 - previously uncharacterized hominins exist in North America; Part 2 - said hominins are the descendants of a putative hybridization event involving an ancient uncharacterized hominin and a modern human). A thesis this complex and counterintuitive requires significantly more in-depth analysis and consideration and should be developed through a series of peer-reviewed publications. Add in the idea that more than one species of hominin may be present in North America and the effort to make a convincing case is multiplied.
Author’s Response:
We removed the discussion of the different variants of Sasquatch living in North America simplifying the manuscript and added an alternative hypothesis outside of hybridization to the discussion so further discussion of their origins can be fully developed in future manuscripts. Lines 688-696.

2. The presentation of the thesis is overburdened in three ways: a) the considerable inclusion of unsubstantiated eyewitness accounts and folkloric/pop-cultural ideations and their presentation as foundational facts that validate the analytical data; b) the presentation of a series of anomalous results generated using routine methodologies resulting in a sub-narrative that is curious and interesting, but unnecessary given the fact that cutting edge technology has been employed to generate sequence data sets that should allow for incredibly detailed analyses yet lead to relatively unambiguous conclusions; and c) the results from Next Generation Sequencing methodologies, especially related to the analysis of the nuclear genome, are fairly superficially treated so as to be unclear whether the authors have selectively aggregated sequences to support a favorable placement among primates. As stated in Perelman et al. (2011; PLoS Genet 7(3): e1001342. doi:10.1371/journal.pgen.1001342), "...primate taxonomy is both complex and controversial, with marginal unifying consensus of the evolutionary hierarchy of extant primate species," and given the expertise of one of the authors in this particular area, it is a bit astonishing that a more in-depth and stepwise treatment is not provided. The decades of work published by the Goodman lab provide an excellent roadmap for a convincing analysis. In summary, this work would achieve a much higher degree of analytical credibility by a) limiting folkloric/cultural references to the minimum necessary, b) moving many of the references to anomalous results to a supplementary role (2nd publication, perhaps), c) fully leveraging the treasure trove of information yielded by the Next Generation Sequencing.
Author’s Response:
a) The introduction with popular information was requested by another reviewer. Since there is broad interest in this manuscript, that reviewer felt that the inclusion of such information was important. I will leave it up to the editor whether this should be included or not. It should be noted that 2 of the authors of this paper did have eyewitness encounters at the invitation of people that have interaction with this species so it is not entirely unsubstantiated.
b) The previous results were left in the manuscript because they supported the next generation sequencing results and also because they were obtained from a large number of purported Sasquatch samples, which behaved in the same manner as the three samples utilized for Next Generation Sequencing.
c) The purpose of the manuscript was to prove that the Sasquatch exist. We chose chromosome 11 as our starting point for building our supercontig in order to generate the phylogenetic tree since the scope of the manuscript was to determine whether a novel primate did exist in North America. This was the logical way to proceed and with the length of the contigs and also with more than 379 genes analyzed (for another manuscript in the future) we feel that this was the direction to go to prove that they exist and as a preliminary study. In the beginning of the study, we had already ruled out non-primate species with not only the mtDNA findings but also with PP16 since it only gives results for ape and human with the exception of the 103 peak at Amelogenin that appears when there is animal DNA present in a sample. The 103 peak was not found in any of the purported Sasquatch samples that were screened using PP16. Furthermore, fully leveraging the genomic data will take years. This manuscript is just the beginning of the study.
3. A significant amount of work leading to the generation of the data central to the thesis was outsourced. It is very difficult to manage the quality control and release standards of contract laboratories from the outside. This does not automatically call such data into question. However, when coupled with a significant amount of anomalous and ambiguous data, such concerns must be considered. This would be especially true concerning Next Generation Sequencing of unknowns where heterologous libraries may result from contamination. This technology is much more sensitive to the detection of contaminating gene sequences than others, and the possibility that contaminating sequences may affect the development of a consensus sequence should be given considerable attention. Reference bias should also be given consideration in the development of a consensus sequence. Similar work typified by that of Pruefer et al. (Nature, 2012/06/13/online) could provide good guidance concerning appropriate quality control analysis.
Author’s Response:
The outsourced laboratories were accredited and work with human samples, which require regulation. We have correspondence where the laboratories were confused about the results and in some cases double-checked their findings. Furthermore, more than one laboratory repeated the analysis at least to some degree, supporting the data of other laboratories. They all utilized the same extractions that had been aliquoted. As far as the Next Generation data, it confirmed the mitochondrial findings from previous laboratories as well as novel sequence. That is why it was important to leave the original data in the manuscript. Furthermore, according to supervisors in data analysis at Illumina, the Q30 quality scores would have been remarkably lower if contamination had been present in the genomes due to competition by the various sequences. That also would have affected the mitochondrial findings, which were consistent with the original mtDNA sequencing.
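
For readers unfamiliar with the Q30 metric mentioned in this response: a Phred quality score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 means roughly 1 error in 1,000 bases. The following is a minimal Python sketch of how the fraction of bases at or above Q30 could be computed. It assumes FASTQ-style quality strings in Phred+33 encoding; the function names and the sample quality string are illustrative and are not taken from the study or from any sequencing vendor's software.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q):
    """Estimated base-call error probability for Phred score q."""
    return 10 ** (-q / 10)


def fraction_at_least(quality_string, threshold=30, offset=33):
    """Fraction of bases whose Phred score is at or above the threshold."""
    scores = phred_scores(quality_string, offset)
    if not scores:
        return 0.0
    return sum(1 for q in scores if q >= threshold) / len(scores)


# Hypothetical quality string for illustration only, not data from the study.
example = "IIII??5+"
print(phred_scores(example))
print(fraction_at_least(example))
```

The sketch only shows what the metric measures. Whether a given Q30 fraction rules contamination in or out is a separate question that the code does not address.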
4. Conflicting statements occur concerning the validation against submitter contamination:
"Control DNA was obtained from the majority of the submitters and was profiled using Promega PowerPlex® 16 (20). All submitters yielded complete profiles."
Author’s Response:
This has been corrected to clarify on line 200: All submitters “tested”

At the very least the STR profiles for the submitters of Samples 26, 31, and 140 should be presented, as these samples yielded the most significant data of the study.
Author’s Response:
Since the submitters’ names are given per their request contractually in Table 1, it would be inappropriate and unethical to publish their DNA profiles as it would link to their identity. However, we have them on file.

5. Sample 26 - It is stated in the study that Sample 26 was derived from a shooting incident. The inclusion of such a sample in the study may be inconsistent with contemporary scientific ethics concerning the treatment of both human and animal study subjects and the procurement of research specimens from such. Moreover, this description casts the provenance of such a sample as murky, at best.
Author’s Response:
We have removed any references to the alleged shooting incident. Line 481

6. The bioinformatics should include gene sequences from expected outlier species that may also be capable of contributing contaminating nucleic acids. For example, a BLASTN search using Sample 26 does turn up some exceptionally strong homology with a gene from Ursus americanus (DQ240386.1). This would support the idea that the consensus sequence may have been affected by contaminant sequences.
Author’s Response:
There will always be some homology with other species when short random sequences are chosen. However, your example of bear contamination can be completely ruled out considering none of the laboratories handling the samples have bear samples. Furthermore, forensic screening would have given a 103 peak at amelogenin on PP16 should there have been non-primate contamination. Sequencing of the mitochondrial DNA with universal primers also would have shown any contamination of the original extractions with non-human DNA. Additionally, that is a single isolate from the black bear 7193328 brain-derived neurotrophic factor. Not only is it present in Sample 26, but it is just as present in Sample 140. Furthermore, DQ240386.1 is 489 bases. I wonder why such a small sequence, i.e. 489 bases out of 2.7 million bases, was the focus of this critique. DQ240386 is statistically significantly aligned with primates and carnivores. In fact, when BLASTing DQ240386, ring-tailed cats of the raccoon family and seal have as much alignment as Ursus americanus. The maximum scores for raccoon and seal are about 850. Maximum score for Ursus americanus is about 900. Max score for Sample 26 is 538. This shows contamination bias.

Minor concern
1. The use of “hominin” in the introduction, as opposed to hominid, creates the perception of bias toward folkloric and unverified eyewitness accounts that seem to emphasize human characteristics. The general discussion of such creatures tends to split along two lines: human or human-like and ape or ape-like. The term hominid, as related to the Family Hominidae, is the higher order, and, thus, the broader grouping. Visual description from a distance might allow for a reasonable description of a hominid, but close examination of the organism or its remnants is required for its placement into the Hominini. Thus, to establish initial objectivity, the authors should use a qualifying adjective such as “putative.”
Author’s Response:
Pan/Homo divergence occurred between 5.4 and 6.3 million years ago (refs below). Since the mtDNA speciation is not only unequivocally modern human, but can be dated by the haplotypes obtained in this study to as late as 13,000 to 26,000 years ago, these individuals must be included as hominins.
We did add the word “putative” in line 85.

Bradley, B.J. (2008). "Reconstructing Phylogenies and Phenotypes: A Molecular View of Human Evolution". Journal of Anatomy 212 (4): 337-353. doi:10.1111/j.1469-7580.2007.00840.x.
Wood, B.; Richmond, B.G. (2000). "Human evolution: taxonomy and paleobiology". Journal of Anatomy 197: 19–60. doi:10.1046/j.1469-7580.2000.19710019.x. PMC 1468107. PMID 10999270.

Referee B
1. This is clearly important information that I hope the public will have access to soon. However, I was immediately taken by surprise after thoroughly reading the manuscript to see such a high reference to Hominins, including the title. I was surprised because there is no substantial evidence presented by the author that the species identified in the 3 whole genomes is a biped. Eye witness accounts are the only data presented to determine or substantiate a biped of any kind. It would be much more appropriate to delineate in this manuscript that novel genomic evidence highly suggests an unknown species living contemporaneously in the Continental and Subarctic United States and Canada within the Order: Primate. There is no conclusive evidence that the unusual data found in the follicular hair morphology or the tremendous genome results indicates a "hominin". The tribe Hominini (of Homininae) is reserved for Homo, post Panini divergence. Genetic information alone is not enough to classify the species within a tribe or clade under order - Primate. Though there is substantial alignment in some data relating to hair and other DNA studies, it is too presumptuous to purport the extant data included in the manuscript conclusively classifies this unknown species as hominin. An example (to the author) of how to, perhaps, revise:
---The genomic data of sample 26 indicates the specimen allegedly collected from an unknown creature is certainly novel and indicates an unknown species. The phylogenetic trees of sample 26 indicate a species highly aligned with the order: Primate. If the alleged story related to us by the hunter who submitted the sample is accurate, the species is closely related to the hominin tribe because the hunter claimed to have seen the creature walking "upright". Our chromosomal analysis would substantiate a close hominini relationship. Further study and analysis will be ongoing to determine such classification. The phylogenetic trees are quite compelling and are probably the most substantial of the information given within the manuscript. However, the information and data are not satisfactory to hypothesize anything more than a living unknown primate. The molecular genetic data is quite compelling and enough to publish in and of itself. But to use taxonomic language in this manuscript, especially related directly to the species itself, is inappropriate. In order to include the level of specific taxonomy presented in the manuscript such as hominini, gross anatomical evidence along with molecular physiological evidence must be included in the manuscript. Even a clear video of this species actually walking bipedal would give some sort of gross anatomical evidence. But eye witness accounts are not enough to make such hypotheses. I do, however, believe the eye witness accounts are important and should not be excluded. They are just not appropriate to utilize for evidence to propose any hint of taxonomy, including hominini.

Author’s Response:
Pan/Homo divergence occurred between 5.4 and 6.3 million years ago. Since the mtDNA speciation is not only unequivocally modern human in 100% of the samples included in this study, and furthermore, can be dated by the haplotypes obtained in this study to as late as 13,000 to 26,000 years ago including the mtDNA from the 3 Next Generation genomes, these individuals must be included as hominins.  The human mitochondrial DNA haplotypes were consistent between the Next Generation Sequencing and the original mtDNA whole genomes sequenced in the beginning of this study.

1. Bradley, B.J. (2008). "Reconstructing Phylogenies and Phenotypes: A Molecular View of Human Evolution". Journal of Anatomy 212 (4): 337-353. doi:10.1111/j.1469-7580.2007.00840.x.
2. Wood, B.; Richmond, B.G. (2000). "Human evolution: taxonomy and paleobiology". Journal of Anatomy 197: 19–60. doi:10.1046/j.1469-7580.2000.19710019.x. PMC 1468107. PMID 10999270.

2. The molecular genetics in this manuscript are the most important and it would be important to include information regarding the analysis of the whole genomes. The phylogenetic trees are exciting to look at and speculate about, but there is not enough analysis to determine phylogenicity. I highly suggest including language that 'it is recognized that continued analysis of the nuclear DNA will be required in order to determine phylogenicity.' The genomic information is certainly impressive, but not as conclusive as the manuscript proposes. It is inappropriate to make connections between the various samples. In fact, I believe this weakens the manuscript due to the reality that the genomic data and mtDNA analysis do not align well against each other (except for 2 of the whole genomic sequences). There are similarities, and those similarities are important to note. But conclusions cannot be drawn about the inter-relatedness of these various samples. The number of samples do indicate high alignment with Homo along chrm11. However, there are many other novel sequences across the several samples that create confusion for the reader. Perhaps it may be helpful to split the manuscript up into several different studies, or simply divide the manuscript up into major sections to delineate the different evidentiary aspects. But to attempt to make links between the hair samples and the novel genomic sequences and other DNA evidence is premature. Tremendous work would need to take place with substantial evidence, including gross anatomical evidence, to make such linkages between the various samples in the study.

Author’s Response:
Mitochondrial DNA has been universally utilized to determine species by the scientific community. Since the mtDNA in Sasquatch is unequivocally modern human in 100% of the samples included in this study, that places the Sasquatch as human. However, since the nuDNA is novel to a large extent, the speciation of a potential progenitor is what is in question, especially since there are gene sequences that align 100% with human interspersed in the nuclear genome.

3. The other major problem with this manuscript is, again, the alleged specimens these samples came from. It is important to state the samples came from alleged Sasquatch specimens, which is done well throughout the manuscript, but not consistently. It would be helpful, however, to include additional evidence to corroborate the theory that the samples came from a large non-human biped. Video evidence of this species walking bipedal would be important to help strengthen the manuscript. In addition, it is inappropriate to make statements regarding the dynamic between Homo sapiens DNA and the whole genome sequence. It is appropriate to include statements about distant relationships with the order- Primate. But it is certainly premature and the evidence is not conclusive enough to make such close connections between this unknown species and Homo sapiens. I do believe that there is strong data in the whole genomes to suggest alignment with Homo sapiens. However, there is not enough evidence to make statements about hybridization events. Hybrid phenomena across the Homo/Pan/Gorilla taxonomy are currently highly debated in the scientific community. Making such claims is far too premature for a manuscript like this. The manuscript demonstrates a universal bias toward a hominini hypothesis. This anthropomorphic bias is problematic for this manuscript. A scientific manuscript is intended to present facts, not biases. If the editorial board decides to publish this manuscript with such a clearly "hominid" favoured bias, they certainly should proceed, but I recommend doing so with a strong disclaimer that an authored bias does exist in the manuscript.

Author’s Response:
Since the mitochondrial DNA places them as human, in both the original sequencing of the mitochondrial whole genomes as well as the Next Generation Sequencing, it is not premature to align them with Homo.  We did add another hypothesis in addition to the hybridization theory on lines 692-696. In consideration of the mitochondrial whole genomes, these are the only two viable hypotheses available.

5. The inclusion of the Q30 scores was very important for the author(s) to do. This data strengthens the manuscript more than anything else included (if editors wanted to publish based on the Q30 scores alone, then there would be no reason not to publish). Furthermore, I am also most impressed with the presentation of the phylogenetic trees for the 3 whole genomes. They clearly identified primate relationships and alignment- though sample 31 seems to be very different from the other two whole genomes. That is concerning and confusing. It may make more sense to remove sample 31 altogether until the study can further understand and explain the huge difference with sample 31 (including conclusive evidence, not speculation). I am also pleased with the inclusion of the high definition pictures of human hair morphology versus the unknown species comparison. However, it would strengthen the manuscript to cross reference human morphology so as to substantiate the statements regarding the comparisons. I am also pleased to see the substantial amount of data and work taken to ensure high quality DNA samples with clear and appropriate techniques to reduce and eliminate contamination; this process was well done and substantiated.

Author’s Response:
References 15-19 refer to hair analysis and human hair morphology as well as animal hair analysis. 
As far as Sample 31, it should be included because it has the same characteristics as all of the other samples, human mitochondrial DNA and novel primate sequence in the genome as well as human sequence.  In the original manuscript we discussed that eyewitness accounts clearly note physical differences related to geographic location in North America.  We feel that 31 is such a variant.  We had taken this information out of the manuscript but will re-introduce it if requested.  The fact that overall, the genome is consistent with the others as far as its makeup clearly establishes its place in the manuscript. 
6. The pictures included thus far of a supposed creature that is claimed to be a Sasquatch are of terrible quality and should be replaced with much clearer photographic evidence or video. These pictures weaken this manuscript. The inclusion of an image of sticks being put together in some kind of pyramid is inappropriate. I would strongly suggest removal of this image.
Author’s Response:
We included the stick structure because one of the samples (168) was obtained within this structure. Not only did this document chain of custody for this sample but the fact that a viable sample was found within the structure supports the discussion of eyewitnesses encountering unusual stick structures within areas purported to be inhabited by the Sasquatch. As far as video, we are including Supplementary Video 1 of a juvenile female Sasquatch sleeping. This video is from the Sasquatch that Sample 37 was obtained from, which also lends credibility and chain of custody to that sample. The still photos have been removed from the manuscript with the exception of Figure 4, which has now been edited to tie it to Supplementary Video 1 on lines 109-110 since the video is clearer and in high definition.

Journal Nature Peer Review First Submission with Dr. Ketchum's Responses:

Author Responses to Referees
Referee #1 (Remarks to the Author):
I must state at the outset that I am not a geneticist, and hence not fully qualified to evaluate the
DNA data in proper critical fashion. Bottom line: there is obviously the backbone here of a paper
that includes a lot of interesting data, but it needs a lot of work before it can be accepted.
1. My main advice (discussed further below): include better graphics (especially a similarity
tree/distance tree/phylogram/cladogram of some sort).

Authors’ Response: We have added 9 phylogenetic trees and added more data in the form of
whole genome sequencing.

2. Remove the problematic conclusions on taxonomic status and the unnecessary section on
recent hominin discoveries, and tidy up the nomenclature and wording.
Discussion of fossil species is probably irrelevant
The manuscript opens with a section on how recently discovered fossil hominins have changed
our views on the hominin diversity of the recent past. While it seems logical to mention these
taxa in passing at least somewhere in the manuscript, I think it's a bad idea to start the article off
with a discussion of these forms - is this really the area of investigation that most requires review
when writing about the possible existence of sasquatch? I would say no. The authors are meant
to be addressing the possible existence and identity of an extant population, not adding to the
roster of fossil forms. In other words, the discussion of Neanderthals, Denisovans and so on
seems like unnecessary padding. There are some basic mistakes concerning terminology and the
use of binomials (e.g., the name _Homo floresiensis_ is incorrectly written with capital first
letter on the species name). Neanderthals do not have the name they do "because of their skeletal
morphology" (rather, they are distinguished by their skeletal morphology).

Authors’ Response: We removed the taxonomic recommendation. Additionally we removed
the recent hominin discoveries (Neanderthal and Denisovan).

3. It would seem far more appropriate so far as I can see to begin with a discussion of the
controversy surrounding the purported existence of 'mystery hominoids', and to perhaps allude to
some of the other technical studies that have claimed to find evidence for the existence of these
alleged creatures.

Authors’ Response: The discussion of the controversy surrounding the existence of these
hominins was moved to the introduction and expounded on, including photos and video.
References were added from other scientific papers and books published addressing the existence
or the discussion of the existence of these hominins.

4. -- Demonstrate with better clarity what the DNA samples represent
It is stated in the discussion of the collected DNA samples that those "not consistent with _Homo
sapiens sapiens_" were then evaluated further. I feel the authors must elaborate on what it was
that made it clear that the samples were not part of _H. s. sapiens_ - the normal graphic way of
representing this is, of course, a gene tree, distance tree or cladogram of some sort. If the authors
are saying that the samples are from a hominid and hominin, but come from a taxon outside of
_H. s. sapiens_, they need to state it more clearly and provide better evidence and clearer
graphics. Any diagram should also make it clear that other mammals (a useful list is included in
the manuscript) were definitely out-groups relative to the group that includes the 'mystery'
samples and those of definite hominins.

Authors’ Response: We clarified and removed the verbiage stating "not consistent with _Homo
sapiens sapiens_". We have added 6 mitochondrial phylogenetic trees which clearly show the
samples had modern human maternal origins and were generated via conventional sequencing as
well as phylogenetic trees extracted from next generation whole genome sequences. We provide
more evidence by sequencing 3 whole genomes at the University of Texas core lab and using a
subsample of extracted reads. The reads were assembled to create a consensus sequence using
the human chromosome 11 as a reference. These concatemers (supercontigs) were used to find
sequence homologs and generate 3 phylogenetic trees, one for each whole genome sequenced.

5. -- Terminology seems odd and needs changing
The terminology used throughout this manuscript, and the conclusions the authors reach, seem
inappropriate in view of the evidence. Indeed, the title tells us that evidence for a 'new species' is
presented, yet the authors actually end up naming a new 'subspecies'. I am definitely of the
opinion that the naming of a new taxon seems inappropriate at this stage (it is likely to be about
as accepted as Meldrum's suggested ichnotaxonomic name _Anthropoidipes ameriborealis_), and
I would also add that the chosen name ('_Homo sapiens feralis_') is odd and highly problematic
(use google to see what I mean). It is stated throughout the ms that the animal is of hybrid origin.
If this is so, it is highly debatable as to whether or not taxonomic novelty is warranted.

Authors’ Response: We have re-written and re-arranged the manuscript to use better
terminology and also have removed any taxonomic references other than to call the hominins

6. I would like to know exactly what is meant by those statements noting that the "paternal
lineage [is] completely unknown", as the authors seem to be introducing a new layer of mystery
to their conclusions. It seems radical enough that they are positing a hybrid origin for this
putative animal, but are they also invoking the existence of an additional animal that was
involved in the proposed hybridisation event? This all seems very peculiar and I am not
convinced that the evidence presented in this manuscript explains it adequately.

Authors’ Response: By sequencing 3 whole genomes that failed to align with any animal or
hominin found in NCBI, there is no conclusion other than that the paternal nuclear DNA
origins are unknown. Previous data suggested that novel sequences and high failure rate were
found in the nuclear DNA sequencing and STR testing. The addition of the three genomes
further supports our initial findings. The nuDNA and mtDNA origins of the Sasquatch are
discordant, with mtDNA indicating human maternal lineage. Analysis of the 3 next generation
whole genome sequences and analysis of preliminary phylogeny trees from the Sasquatch
indicate that these individuals possess an anomalous mosaic pattern of nuclear DNA
comprising sequences that are distantly related to primates interspersed with sequences that are
closely homologous to humans.

7. I would also suggest that the phrase "unknown morphology of the hair" is inappropriate:
rather, the authors are reporting a morphology that is novel.

Authors’ Response: We changed the phrase "unknown morphology of the hair" to
“microscopic morphology of hairs classified as ‘novel’” and “novel hairs”.

8. All in all, this manuscript seems to report the discovery of a novel North American hominin
lineage as determined by DNA analysis (so far as I can tell, the substantial discussion of that data
is appropriate and does consider most relevant factors/avenues of investigation). This is,
obviously, potentially, a significant, Nature-worthy discovery. But it is marred by poor choice of
presentation (that is, the absence of a tree that immediately conveys the position of the samples
to those of other taxa/populations), unnecessary discussion of fossil forms, and inappropriate,
confusing and rather naive proposals concerning nomenclature and the alleged hybrid origin of
the alleged animal. I would definitely like to see this manuscript 'salvaged', but it would need a
thorough revision and re-organisation. I wish the authors the best of luck in their continuing
Authors’ Response: We re-organized and revised the manuscript and added extra data to
support our original findings.

Referee #2 (Remarks to the Author):
The authors analyse some biological samples (mainly hairs) putatively belonging to Bigfoot;
they analyse the microscopic structure of the hair, the mtDNA, forensic STR markers and a few
nuclear genes -including the MC1R- and conclude the results indicate the existence of a
previously unknown human species, which they call Homo sapiens feralis. I am not going to go
into the specific details of the results, which are quite confusing and methodologically debatable,
but will explain what the average molecular biologists would do in two different scenarios
related to the problem presented in this work.

1. Identify a biological sample of unknown specific origin, say, hairs, or coprolites or blood
stains. This approach is widely used in zoology, for instance, to distinguish between wolf and
dog after an attack on a sheep herd. Usually people use universal primers to amplify a
diagnostic DNA fragment that could help in the identification of numerous species. 16S is
probably the most widely used, although in specific situations and for trying to find out the
precise match, you may want to design additional tests on cytb, nuclear SNPs, complete mtDNA,
etc. This has not been done here; the use of human primers will amplify human mtDNA, as has
been the case. Moreover, the fragmentary mtDNA data does not support any unknown hominin
lineage, because the haplotype/haplogroup attribution fits well in what is already known of the
modern human mtDNA phylogeny.

Authors’ Response: The evidence samples were screened with not only published cytochrome b
primers utilized for species identification but also THR/DHL which are universal mammalian
primers in the literature utilized across HV1 for species identification. This is stated in the
manuscript; however, we have made it more prominent. We also had previously sequenced a
number of whole mitochondrial genomes which are in the manuscript. We have increased that
number. The initial mtDNA screening of the samples with the universal primers yielded only
human results. These findings are consistent with previous attempts by other labs and scientists
to validate the existence of Sasquatch through mtDNA which we have documented in the
Discussion. There was no DNA of any other species in the mtDNA in any of the samples
utilized in this study. The whole mitochondrial genomes were consistent with modern human as
were the samples that yielded enough mitochondrial sequence to assign a haplotype but not a
whole mtDNA genome. Even the samples with very little DNA (not enough to achieve a
haplotype) were screened with a short HV2 sequence and gave only human sequence. This
mtDNA alone did not support an unknown hominin; however, the same extractions, when
sequenced on various nuclear loci and amplified with PowerPlex 16, yielded unexpected and
aberrant results with some loci yielding normal human sequence along with novel sequences not
found in genetic depositories. To further support our findings, we have sequenced 3 next
generation whole genomes from which 3 mitochondrial DNA sequences were extracted and were
homologous with human and the previous sequencing. The chromosomal sequences were
discordant and novel, supporting the original findings.

2) Another scenario. We want to prove that the specimen we are studying is an undescribed, new
species. Of course the definition of what a species would look like genetically is a tricky
question (see for instance, the Denisovans), but you would be expected to generate a huge
amount of genomic data (if possible, the complete genome), construct phylogenetic trees and
show that your specimen represents a deep, undescribed clade in the current phylogeny. Even so,
some might argue that genetic divergence is not equivalent to species difference, but at least you
would have a good point to support your claims. This has not been done here, and the analysis of
three random nuclear genes cannot be used for the purpose of defining a new species.
In short, the conclusions are not supported by the methodology. I would suggest that the authors
generate complete mtDNA genomes and, if possible, sufficient nuclear data, even from shotgun
approaches, and build phylogenetic trees with all possible mammal mtDNA genomes and
nuclear data available at GenBank.

Authors’ Response: We added more data to the manuscript by sequencing 3 whole genomes
using next generation sequencing. Previous data suggested that novel sequences and high failure
rate were found in the nuclear DNA sequencing and STR testing. The addition of the three
genomes further supports our initial findings. The nuDNA and mtDNA origins of the Sasquatch
are still discordant, with mtDNA indicating human maternal lineage. Analysis of the 3 next
generation whole genome sequences and analysis of preliminary phylogeny trees from the
Sasquatch indicate that these individuals possess an anomalous mosaic pattern of nuclear DNA
comprising sequences that are distantly related to primates interspersed with sequences that are
closely homologous to humans.

Referee #3
(Remarks to the Author) I believe that among the most important abilities defining a real
scientist is his/her ability to stay open-minded and accept that the world may be radically
different from common beliefs as long as such radical claims are supported by sufficient hard
scientific evidence. Such a radical claim is presented by Ketchum et al., who basically propose the
existence of a new contemporary sub-species of Homo in North America termed Homo sapiens
feralis. The claim is based on morphological and DNA based analyses of samples such as hair
and bark shavings, tissue, toenail, saliva, and blood samples taken by various people in areas
where an unusual bipedal hominin-like creature has been visually observed. Based on these analyses
the authors claim to have found evidence of a new contemporary hominin with unique hair,
strange nuDNA, and human mtDNA. An exceptional claim such as this demands exceptionally
convincing evidence - something the authors do not have:

1. First, the data does not make logical sense. Mitochondrial DNA genomes being identical to
those of contemporary humans can only be explained by relatively recent interbreeding between this
new hominin and women of Caucasian descent. Did such interbreeding go back many thousands
of years, one would expect differences from modern mtDNA genomes. The mtDNA results are hardly
explainable unless one believes that American women of Caucasian descent (within the last 200-300
years, as it's America) run around in the forest having sex with an undiscovered hominin and
leaving the baby to the care of the new hominin (as the rest of us have not heard about
such hybrid babies, the baby must be sent off). It is also striking that the entire mtDNA
lineage of this new hominin is purely human. One would expect at least some mtDNA genomes
coming out as being in accordance with this being a new hominin or, if nothing else, some of the
mtDNA being Native American (everything being equal, they have been in America more than
10,000 years).

Authors’ Response: We now have two samples that did yield American Indian haplotypes.
These were late arriving samples and were added to the manuscript. As far as when this species
arrived in the United States, we do not know; however, due to H haplotypes in their
mitochondrial DNA, the age of these hominins is less than 15,000 years. However, they could
have arrived in the United States before Native American peoples according to the Solutrean
Theory now added as a reference in this manuscript. As far as the mitochondrial DNA being
homologous to human, we used next generation whole genome sequencing on three samples.
The mitochondrial DNA sequences were extracted from the genomes and all three were
consistent with the same human mtDNA sequence previously sequenced in the beginning of our
study. The nuDNA however, was a mosaic of human sequence interspersed with novel sequence
related to primate lineage. This supports the previous findings reported in the original

2. Secondly, PCR based methods and methods used for SNP detection by the authors are known
to be highly unreliable when applied to minor amounts of degraded DNA (I know that personally
for the SNP detection approach and simple PCR). This is especially the case for nuDNA
templates that are more prone to be affected by damage than those of mtDNA (due to copy
number differences). In fact, copy number differences can very well explain the DNA findings,
i.e. the specimens are human in origin, which is why the authors amplify mtDNA genomes matching
contemporary Caucasians. The nuDNA, however, is of too poor quality for amplifying the
attempted sequence length, creating PCR artifacts. It makes good sense even for the hair
shaft samples, which by nature have degraded DNA (even when taken "fresh"). I am sorry but this
appears to be a much more straightforward scenario than having a previously undetected (by
science) hominin sub-species running around in the forest mating with Caucasian women. I am
by no means convinced that this has anything to do with a new hominin subspecies. To make a
compelling case I would need to see mtDNA genomes and large numbers of nuDNA sequences that
point in the direction of a new hominin species, i.e. ape or human like without being identical to
known species.

Authors’ Response: We have now included Figures 7 and 13 that show the quality of the DNA
on yield gels. There was little to no smearing and the DNA was pristine. This supports the fact
that the DNA was not degraded enough to mar the results. We sequenced 30 mitochondrial
whole and partial genomes and they were all homologous with modern human mtDNA. We also
sequenced several nuclear loci from the same extractions which encompassed a number of long
sequences up to 900 bases. It is difficult to amplify and sequence long amplicons with degraded
DNA. In order to ascertain if the novel nuclear DNA was an anomaly, we used next generation
sequencing to generate 3 whole genomes to determine if the nuclear DNA was indeed novel. If
the DNA was badly degraded, the sequencing of whole genomes would have been impossible.
The 3 extractions utilized for next generation whole genome sequencing passed all quality
controls in our laboratory and the university core laboratory prior to sequencing. The libraries
generated from these samples also passed stringent quality control measures in the core
laboratory prior to the next generation sequencing. If the libraries had not passed QC, the
sequencing would not have been performed. We have furnished 9 phylogenetic trees
(mitochondrial and nuclear) to support the results of this study.

3. I'm not an expert on hair morphology but I expect that identification mistakes are made.

Authors’ Response: The hair expert that examined the hair is a forensic trace evidence
supervisor in a large forensic laboratory. He performs hair analysis and testifies about his
findings in court on a daily basis. He also testifies concerning and examines hair from many
species of animals as well as human. Furthermore, he had a wide collection of animal hair
standards with which to compare these hair samples utilized in this manuscript.

Referee #4 (Remarks to the Author):
The authors seek to identify the species of 130 unknown, though purportedly hominin, hair and
tissue samples. The samples are subjected to multiple forensic and genetic tests, including hair
analysis, mitochondrial and nuclear genome sequencing and electron microscopy. They postulate
that, while the mitochondrial genomes of all tested samples are conclusively human,
discrepancies in Y-chromosome STR and amelogenin amplification, lack of sequence homology
to known species, as well as structural abnormalities in DNA viewed by electron microscopy are
indicative of an unusual hominin source. They identify this source as a new species, which they
call Homo sapiens feralis.

Extraordinary claims require extraordinary evidence. At no point do the authors provide
adequate evidence to support their outrageous claims. The paper suffers from a myriad of faults,
some of the most egregious being:

1. Results are not documented in any adequate detail. For example, the DNA sequences
determined are not given, rendering any coherent understanding of the results impossible.
Authors’ Response: Sequences not shown in mutation reports have been added to the
Supplemental Data. Mutation reports are now added to the manuscript in the Supplemental Data
allowing transparency of sequences previously reported as well as new sequencing that has been

2. Failures in tests such as SNP analysis, amplification and electrophoresis are taken as evidence
for differentiation, when they are likely explained by DNA degradation or contamination. In one
instance the authors attempt to replicate DNA degradation by leaving a blood sample at room
temperature for 4 days and using the sample as a positive control in further testing. They fail to
assess the extent of degradation quantitatively, making this positive control uninformative.

Authors’ Response: We have added Figure 13 to show the level of degradation of the human
control sample in comparison with some of the Sasquatch samples in the study. We have added
3 whole genomes that were successfully sequenced using next generation sequencing
technology. These genomes supported our previous data and were novel. Furthermore, we have
added photos of the raw extracted DNA on agarose to support the quality of the DNA.

3. When an unknown DNA sequence is amplified from a sample this is taken as evidence that it
comes from the purported unknown hominin when in fact mispriming from DNA of an unknown
microorganism is a much more plausible scenario.

Authors’ Response: We used next generation sequencing to sequence three unique whole
genomes. The findings from these genomes support our previous findings. These samples were
high quality and yielded outstanding genomes. The quality control prior to sequencing ruled out
any large degree of bacterial contamination. We also provided new Figures 7, 8 and 13 to
address any levels of degradation and contamination. Figures 8 and 13, as well as the
histopathology report on Sample 26, were added intentionally to show that Sample 26,
which was one of the whole genomes sequenced, was neither degraded nor had a high
concentration of bacteria. Figure 7 was a yield gel showing some of the DNA utilized in this
study. Note that the samples are not smeared.

4. Genetic differentiation based on electron microscopy is improbable. On this scale, one cannot
distinguish the differences between any two species. The observed structural differences, if
legitimate, are at best indicative of DNA damage.

Authors’ Response: The electron microscopy was not intended to provide species identification
but was included as supporting evidence for the unusual behavior of the amplified DNA. We
have further addressed this in the revised manuscript.

5. There is a minimum of statistical analysis. One would like to see, for instance, a phylogeny
based on the mitochondrial genome or HVI regions, or even the pair-wise distances between the
samples and other humans.

Authors’ Response: We have provided six mtDNA phylogenetic trees and three nuclear
phylogenetic trees derived from the 3 whole genomes sequenced with next generation
sequencing and have shown the pair-wise distances.
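
For readers unfamiliar with the pair-wise distances the referee asks for, the following is a minimal Python sketch of the simplest such measure, the uncorrected p-distance (the proportion of differing sites between two aligned sequences). It is not the authors' method, which is not described here; the function names and the toy sequences are hypothetical and serve only as illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_distances(seqs: dict) -> dict:
    """Return the p-distance for every unordered pair of named sequences."""
    return {
        (m, n): p_distance(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }


# Toy aligned sequences (hypothetical, not data from the study).
toy = {
    "sample_A": "ACGTACGTAC",
    "sample_B": "ACGTACGTAT",
    "reference": "ACGTTCGTAC",
}

for pair, dist in pairwise_distances(toy).items():
    print(pair, dist)
```

In practice a corrected distance model and a proper multiple alignment would normally be used; the sketch only shows what a pair-wise distance table contains.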

6. The authors do not follow the proper protocol for naming a new species.
Authors’ Response: We have removed any taxonomic references and have chosen to call the
hominins Sasquatch.

Journal Nature Peer Review Second Submission with Dr. Ketchum's Responses:
  Author Responses to Referees 2

Referee #1 (Remarks to the Author):
Comments re: The sasquatch genome project
1.  Firstly - I reviewed a previous version of this ms and am pleased to see that the authors have wholly restructured their ms (it is now substantially improved, with many tangents and irrelevant areas of discussion removed or substantially modified). The previous version was rather opaque - was it dealing with sasquatch or not? I am pleased to see that this new version is bolder and more direct. However, does it present satisfactory data, and include satisfactory analysis of that data?

Given that much of the analysis and discussion concerns genetics, and given that I'm not a geneticist, my general feeling is that any decision about the fate of this ms must fall 'into the lap' of geneticist reviewers. Based on more general issues discussed here, I conclude that the work currently suffers from some areas of obfuscation that prevent me from understanding exactly what it is the authors are proposing. I therefore cannot recommend publication at the moment: if relevant geneticists argue for rejection of the ms, I would concur with this opinion. I admit to being greatly intrigued by the data being presented, however, and (as per last time) wonder if there is a 'Nature-worthy signal' in here somewhere.

The discussion early on of "high quality complete genomes" and so on sounds impressive, and I hoped to be impressed by the data collected and analysis of it. However, right out of the gate (1st line of abstract) we are told that the tissue samples concerned "were obtained... from... Sasquatch". This is problematic - the authors should state that the tissue samples were hypothesised to be from sasquatch, since the tissue samples were not collected directly from animals themselves: the project is testing the hypothesis that the samples come from an unidentified hominid.

Authors’ Response:
Though a number of the samples were taken immediately after eyewitness sightings of Sasquatch, we did change the verbiage to correct this, Line 36, “hypothesized to be” from Sasquatch.

2. The comment, also appearing early on in the ms (p. 4), that the DNA evidence reveals "human DNA interspersed with sequence [data] that is novel and distantly related to primates" raised my eyebrows - what do the authors mean by this? That sasquatch, if imagined to be verified as real, is a hybrid of human and non-primate ancestry? Non-primate? I was interested in seeing this qualified: does it stem from a misuse of the term 'primate'? (I think we all agree that humans are primates).

Authors’ Response:
Line 59: Corrected to read: human DNA interspersed with sequence that is novel but primate in origin.

3. It is further stated on p. 27 that phylogenetic results "indicate distant relationships with multiple primate sequences". Again - what exactly do the authors mean here? This is never explained and I regard it as a major problem - the universally favoured hypothesis (as made clear elsewhere in the present ms) is that sasquatch (if real) is a primate, and specifically a hominoid, and presumably a hominid, a hominine and perhaps a hominin, so the text creates the impression that a non-primate ancestry is being preferred. Or, again, is this because 'primate' is being used to mean 'non-human primate'?

Authors’ Response:
We omitted the word distantly in Line 584.  The novel primate lineage comes off in the non-human range even though there are human sequences interspersed in the genome. So, we were referring to non-human primate.

4.  I was hoping that this would be clarified by examination of the phylogenetic/gene trees. While I'm pleased that several phylogenetic/gene trees have been included, they desperately need redesigning, since the text on the branches is so small that it can't be read without exceptional magnification. As such, the trees are essentially useless. 

Authors’ Response:
We have attempted to enlarge them; in the PDF, they can also be enlarged in Reader.  We are still trying to make them easier to read and currently have them at a publisher attempting to enlarge them without changing the actual data and distance tree.

4. I appreciate that the authors have apparently tried to include as much data as possible in this ms. I would say that the photographic evidence (Fig 4A and 4B) should not have been included, since those images are poor in quality, do not appear compelling, and weaken the credibility of the ms. Scientific names are written incorrectly on p. 23. 

 Authors’ Response:
We removed the still of 4A and relabeled 4B as Figure 4.  Supplementary video 1 shows the same Sasquatch in greater detail and in high definition and therefore supports the use of Figure 4, however if requested, we will remove 4 also.  Since this individual is the donor of Sample 37, we felt that tying her to a photo and video was important.  We will remove Figure 4 if the editorial board or the reviewers believe that it should be removed after this explanation.  Lines 584 through 586 were corrected as far as the scientific names.

Referee #2 (Remarks to the Author):

Short list of some scientific problems:

1. Repeated and abusive use of vague terms such as "multiple", "numerous", "many" or strange scientific terms such as "anomalous", "striking", "unexplained" or "unexpected". 

Authors’ Response:
The words “multiple”, “numerous” and “many” were either removed or greatly reduced in their usage.  "Anomalous", "striking", "unexplained" and "unexpected" were completely removed from the manuscript. 

2. There is a complete lack of statistical evidence or referenced scientific work for the hair analysis; claiming "they were strikingly distinct from human hairs" without any reference or evidence beyond this bold, authoritative statement is not acceptable. 

Authors’ Response:
References 15-19 refer to the hair analysis. Figure 5 illustrates the differences between Sasquatch hair and human hair.  Materials and Methods also discuss the methods utilized for hair analysis.  Statistics are not utilized in forensic hair analysis.

3. The vagueness of the extraction procedures is also surprising (how was the contamination minimized or DNA recovery maximized?): "The DNA was extracted in a clean room using forensic science procedures that minimized contaminant DNA in the samples while maximizing DNA recovery."

Authors’ Response:
The Materials and Methods were moved to the supplement due to the length of the paper, as stated in the manuscript.  The extraction techniques were discussed at length in the Supplemental Materials and Methods.

4. Some paragraphs are confusing and again, the use of terms such as "clearly" or "as expected" is strange. "Control DNA was obtained from the majority of the submitters and was profiled using Promega PowerPlex16 20. As expected, all submitters yielded complete profiles. In contrast, when the hominin samples were tested using Promega PowerPlex16, partial profiles were evident in almost all cases. These data clearly showed that the samples were not contaminated by their submitters." (how?)

Authors’ Response:
The words clearly and as expected were removed.  Lines 265 and 266 were edited “These data showed that the samples were not contaminated by their submitters since all of the submitters’ profiles excluded as being a contributor to the hominin profiles/partial profiles.” to show how the samples were not contaminated by the submitters.

5. The authoritative and unsupported statements follow along the paper: "In the experience of this laboratory, hair samples constitute a very reliable source of DNA." 

Authors’ Response:
We had previously had the following in the paper but when we revised it, we discarded it due to the increased length of the manuscript, however, we have reentered it into the manuscript, Line 280-283: “Almost all large animal breed registries utilize plucked hair samples as their primary source of DNA for their parentage testing programs.  One horse registry alone processes over 100,000 hair samples per year.  These hair samples are archived and are viable for many years after they are submitted.  Therefore the hominin samples were well within our expectations as far as nuclear DNA yield and quality”. 

6. Some methodologies are totally unspecific and uninformative: "After extraction, yield gels with 3 µL of the extracted DNA were utilized to determine if there was DNA present and whether it was degraded". 

Authors’ Response:
The Material and Methods discuss this and it is visualized in Figure 7.  Yield gels are a well established method for visualizing not only the quantity of DNA in an extraction but also the quality of the DNA since DNA will “smear” on a gel instead of giving a distinct band if the DNA is degraded.  Mild degradation is depicted in Figure 13 with the human control that had been purposely degraded prior to extraction.

7.  More vagueness (we don't know how many samples are all): "All of the screened samples revealed 100% human cytochrome b and hypervariable region 1 sequences". 

Authors’ Response:
Line 317 changed to “All 110 screened samples”.

8. More methodological problems; in degraded DNA, artifactual amplifications are expected and potentially undescribed alleles have to be sequenced to see what's in the PCR product. "PowerPlex16 amplification of the hominin samples yielded only partial profiles with off-ladder alleles while amplification of DNA."  ents : "Once it was established that the Sasquatch nuclear DNA did not conform to human DNA.."

Authors’ Response:
 The DNA was not degraded as we had a yield gel of the raw DNA showing clear bands (Figure 7) not smears. As some of us are forensic scientists, we are expert in determining DNA quality and mixture (contamination) interpretation and the interpretation of PP16 which we use in court for our DNA profiles.  We included references which support our statement as well as the peak heights on electropherograms showed adequate DNA where the amel X dropout should not have occurred (Figure 9).  We also sequenced amel X and the results are seen in Table 4. References 43-46 address the X  Dropout.  We can address this exponentially if necessary.

9. More vagueness: how were the complete mtDNA genomes obtained?  Apparently the samples were sent to a company, I guess they amplified it in overlapping fragments? of which length? Or were they captured with some enrichment method and posteriorly sequenced with next generation technologies? One cannot say the samples were sent and results sent back and that's it

Authors’ Response:
From Materials and Methods:  Aliquots of purified DNA from all of the samples in this study, along with human controls to monitor for possible contamination, were shipped to Family Tree DNA. Proprietary methods were used to amplify the mitochondrial DNA genome. The DNA was amplified using 48 sets of human specific primer pairs that overlapped. Extra primers were developed and utilized in case of failure due to mutation. The amplicons were sequenced on an Applied Biosystems® 3130xl Genetic Analyzer. 

10. Instead of generating meaningless mtDNA phylogenetic trees, the authors should check the well known mtDNA phylogeny (for instance, at PhyloTree) and tell us exactly the haplogroup, and/or subhaplogroup and haplotypes of each sample. It will be self evident anyway that these mtDNA genomes fell within the modern human variation, no need for any tree. 

Authors’ Response:
We were asked for the mtDNA phylogenetic trees in the first submission so we furnished them.  They were actually important since the same mtDNA sequences were found in the next generation whole genome sequencing as in the original mtDNA sequencing.  We furnished trees for both the original individual mtDNA sequencing as well as the sequence pulled from the whole genomes.  We furnished a table with haplogroups for the various samples (Table 2).

11. Unsupported claims (by the way, allelic dropout is a well known phenomenon in degraded samples and still the most plausible explanation): "It is noteworthy that AmelX allele dropout occurs in significant numbers of the unknown samples yet seldom occurs in normal human testing." 

Authors’ Response:
We have previously addressed that the samples were not degraded.  We also furnished references for the X dropout (References 43-46) and the peak height (RFU) of the Y peak precludes degraded or insufficient DNA as a cause of X dropout (Figure 9).  The samples were tested twice with different methods (PP16 and amelogenin only) and the dropout repeated.  Amel X was also sequenced and it failed on most of the samples though the human controls provided normal sequence.  Three of the authors have seen literally thousands of PP16 profiles from both paternity and forensic applications.  Only one author has actually seen one case of X dropout and it was because of mutation, not degradation or low yield DNA.  From the paper: The genotyping of the amelogenin locus produced the most consistent results across the samples tested. The DNA samples yielded four types of results: XX, XY, Y and null. The dropout of the X amplicon was the most significant of the findings observed with the STR genotype analysis of Amelogenin (Figure 9, Supplementary Data 3). This dropout was reproduced in several individual samples and was repeatable both in the multiplex of PowerPlex 16 and the analysis of the STR locus, so it is unlikely to be an experimental artifact due to low quantity or degraded DNA (Table 3). The repeatability and number of samples exhibiting the X dropout is inconsistent with what would be expected with normal human allele dropout (References 43-46). It is noteworthy that AmelX allele dropout occurs in significant numbers of the unknown samples yet seldom occurs in normal human testing.
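
The four amelogenin result types named in the response (XX, XY, Y and null) can be illustrated with a small Python sketch that sorts samples by which peaks are called. This is only an illustration under stated assumptions: the 50 RFU calling threshold, the function name and the toy peak heights are hypothetical and are not taken from the manuscript; real casework uses a laboratory's validated analytical threshold.

```python
from collections import Counter


def classify_amelogenin(peaks: dict, min_rfu: float = 50.0) -> str:
    """Classify an amelogenin result from X and Y peak heights (in RFU).

    min_rfu is a hypothetical calling threshold used only for illustration.
    """
    has_x = peaks.get("X", 0.0) >= min_rfu
    has_y = peaks.get("Y", 0.0) >= min_rfu
    if has_x and has_y:
        return "XY"
    if has_x:
        return "XX"    # X peak only, conventionally read as female
    if has_y:
        return "Y"     # Y peak without X: the "X dropout" pattern
    return "null"      # no peak called at the locus


# Toy peak heights (hypothetical, not data from the study).
samples = {
    "s1": {"X": 900.0, "Y": 850.0},
    "s2": {"X": 1200.0},
    "s3": {"Y": 700.0},
    "s4": {"X": 10.0, "Y": 5.0},
}

tally = Counter(classify_amelogenin(p) for p in samples.values())
print(dict(tally))
```

A tally of this kind, reported alongside the Y peak heights, is one way to show a reader how often each result type occurred.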

12. More vagueness (how long? how do you assess pristiness?): "DNA samples that yielded long and pristine sequences". 

Authors’ Response:
The quality of the DNA sequences can be assessed by viewing the electropherograms.  These are available if needed.  Also the Q30 score for the whole genomes denoted the extremely high quality of the sequences, Lines 544-558.
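
As background on the Q30 score mentioned above: a Phred quality of Q corresponds to an estimated base-call error probability of 10^(-Q/10), so Q30 means roughly one error in 1,000 calls, and a read set's Q30 figure is the fraction of bases at or above that quality. The Python sketch below shows the calculation on a single FASTQ quality string. It assumes the common Phred+33 encoding; the function names and the toy quality string are hypothetical, and the sketch says nothing about how the authors' core laboratory computed its figures.

```python
def phred_to_error_probability(q: float) -> float:
    """Estimated base-call error probability for a Phred quality score."""
    return 10 ** (-q / 10)


def fraction_q30(quality: str, offset: int = 33, threshold: int = 30) -> float:
    """Fraction of bases whose Phred score is at least `threshold`.

    `quality` is a FASTQ quality string; `offset` = 33 assumes Phred+33.
    """
    if not quality:
        raise ValueError("empty quality string")
    scores = [ord(ch) - offset for ch in quality]
    return sum(1 for s in scores if s >= threshold) / len(scores)


# Toy quality string (hypothetical): 'I' = Q40, '?' = Q30, '5' = Q20, '#' = Q2.
toy_quality = "II??55##"
print(fraction_q30(toy_quality))
print(phred_to_error_probability(30))
```

Note that Q30 describes confidence in the base calls of the sequencer; it does not by itself establish what organism the sequenced DNA came from.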

13.  More methodological problems (the observed -again, vague- results are likely due to the amplification of environmental and degraded DNA yielding unspecific PCR products): "The resulting sequences ranged from totally non homologous matches, not found in GenBank after multiple BLASTs (including dissimilar sequence BLASTs) to novel SNPs and even failure to sequence" 

Authors’ Response:
The quality of the DNA was addressed using Figures 7 and 13 and the Q30 scores.  The novel sequences and SNPs repeated in the genomes although there was no failure in the whole genomes.  Failures in the early testing can be attributed to primer design failing to amplify novel sequence.

14. More methodological problems: the MC1R sequences (which primers?, which length?), come apparently from PCR products that were not cloned (this is a standard procedure while working on ancient DNA samples to ascertain the heterogeneities present in the PCR product and also the original extract). Thus the C to T change observed in two samples could be the result of cytosine deamination, a well known phenomenon in degraded DNA. In any case, what can be deduced from this section (and also from the mtDNA and the MYH16 section) is that these samples are modern human DNA samples. 

Authors’ Response:
We have previously discounted degradation of the DNA using visualization by yield gel and Q30 scores on the next generation sequencing.  Also, this is fresh, contemporary DNA, not ancient DNA.  Sample 26 was worked up thoroughly including histopathology that showed fresh tissue with no degradation or bacterial contamination.  Since a whole genome was sequenced with fresh tissue and still provided the same results as were obtained in the early testing, the only conclusion is that the genome is novel and high quality.  It is no different than any other genome sequenced today with fresh tissue or blood.  We have added the primers as Supplemental Data 9.

15. Meaningless experiments and sections (obviously the low performance in the SNP chip is due to low quality DNA): "In an effort to mimic severely degraded DNA that could explain the strikingly low SNP matches obtained, one of the human controls submitted along with the unknown hominin samples comprised non sterile blood that was purposely maintained at room temperature in a moist environment for 4 days in an effort to maximize degradation of the sample. Upon visual inspection, hemolysis of the sample had occurred and bacterial contamination, which often correlates with DNA degradation, was seen. An acrylamide gel was loaded with the degraded human sample to assess the degradation and was visualized with ethidium bromide. Smearing was observed".

Authors’ Response:
If the DNA is degraded, it will smear on a yield gel.   Yield gels have been used for years in forensics and standard DNA testing to assess DNA quality from fresh specimens. We have figures of two yield gels showing DNA that is not degraded and two samples that are showing some degradation, including the human control, Figures 7 and 13.

16. More meaningless sections (artifactual or missing bands are common in ancient DNA, this has been known for about three decades now): "Some of the samples appeared to produce normal amplicons that resulted in bands consistent with the human controls. Other samples displayed clear bands that appeared to be of different sizes than those expected of normal human amplicons. Yet others had multiple bands. Still other samples failed to amplify at all." 

Authors’ Response:
This is fresh DNA and fresh dried DNA, and the degradation issue has already been addressed. At other loci, the same samples yielded long, pristine sequences (as judged by electropherogram) of up to 900 bases that were consistent with human, as well as novel sequences hundreds of bases long.

17. More vagueness (which ones?): "and forensic techniques to ensure that there was no human contamination"

Authors’ Response:
See Materials and methods: “Since the presence of normal human DNA contamination of submitted samples was a primary concern throughout this study, all samples were thoroughly cleaned in a manner consistent with forensic testing procedures. In order to further rule out contamination from human personnel and lab workers, samples from submitters and scientists working with the samples were collected for comparison with the results obtained in the various DNA tests.
Hair samples were then sorted into two groups for extraction at DNA Diagnostics. DNA from those samples containing 5-50 or more single hair roots were selected and the roots clipped into 1.5 mL microcentrifuge tubes. The hair roots were thoroughly cleansed with water and ethanol prior to extraction to remove any extraneous DNA.
Hair roots were placed in microcentrifuge tubes for DNA extraction and ATL buffer (Qiagen) was added. These samples were digested with proteinase K (PK, 20 mg/mL) and dithiothreitol (DTT, 1.0 M) at 56°C overnight, followed by a three-step organic extraction procedure using phenol:chloroform:isoamyl alcohol (25:24:1) with an additional PCI extraction. This process was followed by a butanol wash and buffer exchange/concentration into TE-4 buffer (10 mM Tris, 0.1mM EDTA, pH 8.0) using Microcon®-100 ultrafiltration devices (Millipore, Billerica, MA)92-93.
The remaining unknown hairs with only 1-5 hair roots were sent to the North Louisiana Criminalistics Laboratory (NLCL, Shreveport, Louisiana) for DNA extraction and purification. The roots were cleaned with water prior to digestion. The cleaned roots were digested in ATL buffer (Qiagen), PK, and DTT at 56°C until completely dissolved, which generally was overnight. The DNA in this crude extract was purified using the EZ1® DNA Investigator Kit with cRNA (Qiagen) and eluted into TE-4 on a BioRobot EZ1® (Qiagen).
Saliva swabs, blood swabs, and tissue cuttings (10 mg) were placed in microcentrifuge tubes for DNA extraction. The samples were extracted using the above mentioned organic method with the exception that DTT was not used during digestion.
Reference samples, in the form of buccal swabs from submitters who collected the unknown hair and tissue samples, were isolated using 50mM NaOH and heated to 100°C for 10 minutes followed by the addition of 1M Tris (pH 8.3)98. The DNA extracted at DNA Diagnostics was quantified using a Nanodrop spectrophotometer (Thermo Scientific, Wilmington, DE). Hair samples sent to the NLCL were quantified by real time PCR using the Applied Biosystems Quantifiler® Human kit on an Applied Biosystems Prism® 7000 Sequence Detection System99. Samples that yielded DNA concentrations too low to use in standard testing had their DNA concentration augmented using the multiple displacement amplification method per the manufacturer’s instructions86.
DNA was then visualized on a 1% agarose gel to determine DNA quality by loading 3 µL of DNA extraction.  Appearance of the bands determined the quality and quantity of the DNA extraction (Figures 7 and 9)”.

18. More vagueness: "According to the laboratory, the sequences themselves were of very high quality" 

Authors’ Response:
We have added information and the summary generated by the HiSeq 2000 Next Generation Sequencer showing that the genomes were of high quality via the Q30 score, Lines 478-492, and removed the statement above since it is no longer necessary due to the generated Q30 scores.

19. More methodological problems (environmental, contaminating DNA will produce no matches at GenBank, this is common in all ancient DNA genomic projects): "Some sequences produced no homology matches when BLAST searched against all primate, human, Neanderthal, Denisova, and other sequences in Genbank."

Authors’ Response:
We previously addressed that (1) the DNA is not ancient and (2) the DNA was not degraded or contaminated, as shown both by visualization and testing for the preliminary findings and by the Q30 scores generated by the HiSeq 2000 for the whole genomes.

20. Although they claim to have generated 30x genomes, I understand they have assembled and analyzed only chromosome 11; they don't explain why.

Authors’ Response:

We explained that chromosome 11 is highly conserved in primates.  In the paper: “Thus, the selective supercontigs comprised an abundance of neural associated and putative tumor suppressor sequences all of which are highly conserved in primates and humans and clearly establish that the Sasquatch is closely related to humans. The high homology with multiple primate lineages (including but not limited to, chimpanzee, macaques, gibbons and marmosets) and with humans as demonstrated in phylogenetic trees (Figures 19, 20, and 21) indicate that the supercontigs contain highly conserved human and primate gene sequences”.  
The goal of this manuscript was to prove that there exists an unknown primate living in North America since eyewitness reports consistently describe a creature with the appearance of a primate.  It takes years to analyze a genome so in order to prove that we had a novel primate, it stood to reason that an area of the genome that is highly conserved in primates be the first area of the genome to be investigated.  The supercontigs supported the existence of a novel primate using chromosome 11, which was the goal of this manuscript.  

21. The obvious explanation for this section is that the sample is a mixture of DNAs or has been contaminated at the Sequencing Service (did they use a tag sequence for this particular project? was the service sequencing primates?): "the Sasquatch consensus sequence that showed homology to human chromosome 11 reference sequence is distantly related to multiple primate lineages including Homo Sapiens, Pan Troglodytes (Chimpanzee), Macaca Mulatta (Rhesus Monkey), Nomascus Leukogenys (White cheeked Gibbon) and Callithrix Jacchus (Common Marmoset)."

Authors’ Response:
The UT core lab only sequences human DNA and this statement has been added on Lines 472-473. We also previously addressed that we have added the Q30 scores for the genomes. These scores absolutely prove that there was not a mixture of species in the whole genomes as explained above.
Lines 544-558 from the manuscript: The run summary generated by the HiSeq 2000 next generation sequencer provides scores, Q30, for run quality. Q30 can also be used to determine if there was any contamination (or mixture) found in the samples sequenced. According to Illumina, a pure, single source sample would have an average Q30 score of 85. However, if there was contamination present in the sample sequenced, the divergent sequences would compete against one another causing a contaminated sample to have a Q30 score of 40 to 50. The Q30 scores for the first read for the three genomes sequenced had Q30 scores of 92, 88 and 89 respectively. The second read was slightly lower 88, 84.25 and 83.66, but still in line with the 85 average. The Q30 is the percent of the reads that have the statistical probability greater than 1:1000 of being correctly sequenced. Therefore, not only were the sequences from a single source, but the quality of the sequences were far above the average genome sequenced using the Illumina next generation sequencing platform. The high quality of the genomes can be attributed to the stringent extraction procedures utilized whereby the DNA was repeatedly purified. This ultra-purified DNA also allowed for greater than 30X coverage of the three genomes. The summary of the next generation sequencing generated by the HiSeq 2000 Illumina sequencer is furnished as Supplementary Data 7.
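
For readers unfamiliar with the Q30 figures quoted here, the following Python sketch shows the standard Phred relationship between a quality score Q and a base-call error probability P, Q = -10 log10(P), under which Q30 corresponds to an error probability of 1 in 1000. It also computes the percentage of bases at or above Q30 from a FASTQ quality string. The function names are hypothetical, the Phred+33 encoding is an assumption, and this is not the software that produced the run summary cited in the response.

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for a base-call error probability p."""
    return -10 * math.log10(p)


def percent_q30(quality_string, offset=33):
    """Percentage of bases with Phred score >= 30.

    Assumes Phred+33 ASCII encoding of the quality string.
    """
    scores = [ord(ch) - offset for ch in quality_string]
    return 100.0 * sum(s >= 30 for s in scores) / len(scores)


# Hypothetical quality string: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
print(phred_to_error_prob(30))
print(percent_q30("IIII5555"))
```

The sketch only measures per-base call confidence; it makes no statement about what a given percentage implies for sample composition.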

22. Sequencing data should be freely available to the scientific community after publishing (even better, before). 

Authors’ Response:
We attempted to upload all sequences to GenBank but they refused them (I have the email, and it said this was because we do not have a species name). We asked how to do it, and they refused to return our emails or calls. As a result, we attached all sequences as supplemental data per Dr. Gee's request.

23. In my view, their conclusions are not supported by the data. What we have here is likely low quality DNA samples belonging to modern humans that yield some conflicting results in unspecific genotyping approaches. I also suggest some background contamination in the next generation sequencing from previous primate sequencing projects, again likely influenced by the original low quality of the samples.

Authors’ Response:
This is an incorrect assumption as the data was repeatable and of good quality.  We have defended our position numerous times (above) and have added new data (Q30 scores) supporting our position that the DNA is not degraded nor is it contaminated.

Referee #3 (Remarks to the Author):

I appreciate the additional work the authors have undertaken in trying to meet the concerns raised by me and the other reviewers. However, I am afraid that I do not find the paper worth publishing. The authors still fail to explain in any convincing way how this "new species" carries mtDNA genomes identical to those of modern humans only. I also believe that alternative explanations exist as to why the nuDNA genomes differ from those of contemporary humans, e.g. mapping a mixture of animal DNA and human contamination to human reference genomes.

Authors’ Response:
We felt that the data should speak for itself so we did not include a theory concerning the lack of novel SNPs in the mtDNA or the presence of the modern human mitochondrial DNA across all 110 samples.  We have added a theory to the Conclusions Section in the manuscript, Lines 734-738.

We also previously addressed that we have added the Q30 scores for the genomes. These scores absolutely prove that there was not a mixture of species in the whole genomes as explained above.
Lines 544-558 from the manuscript: The run summary generated by the HiSeq 2000 next generation sequencer provides scores, Q30, for run quality. Q30 can also be used to determine if there was any contamination (or mixture) found in the samples sequenced. According to Illumina, a pure, single source sample would have an average Q30 score of 85. However, if there was contamination present in the sample sequenced, the divergent sequences would compete against one another causing a contaminated sample to have a Q30 score of 40 to 50. The Q30 scores for the first read for the three genomes sequenced had Q30 scores of 92, 88 and 89 respectively. The second read was slightly lower 88, 84.25 and 83.66, but still in line with the 85 average. The Q30 is the percent of the reads that have the statistical probability greater than 1:1000 of being correctly sequenced. Therefore, not only were the sequences from a single source, but the quality of the sequences were far above the average genome sequenced using the Illumina next generation sequencing platform. The high quality of the genomes can be attributed to the stringent extraction procedures utilized whereby the DNA was repeatedly purified. This ultra-purified DNA also allowed for greater than 30X coverage of the three genomes. The summary of the next generation sequencing generated by the HiSeq 2000 Illumina sequencer is furnished as Supplementary Data 7.

Journal refuses to even read the study. Makes the claim that the hairs could have come from any source, with no basis in fact.
