About SCIENCE.NaturalNews.com
SCIENCE.NaturalNews.com carries on the tradition I began in 2006 with the launch of several nutritional reference websites including HealingFoodReference.com, HerbReference.com and NutrientReference.com. Those sites were well received but their data sets were quite small and difficult to expand. So beginning in the spring of 2012, I embarked on an effort to index and categorize the entire PubMed library of science provided by the National Institutes of Health and funded by taxpayer money. This data set currently encompasses more than 21 million studies, with approximately 3,000 new studies appearing each weekday. While PubMed currently makes all these studies available to the public, they are hidden behind a convoluted interface that makes it all but impossible for the public to easily find the most relevant studies they're interested in -- especially if those studies involve two things such as "green tea" and "breast cancer."
In other words, people essentially want to know the scientific answers to questions like:
* Does pomegranate juice prevent prostate cancer?
* Does green tea prevent breast cancer?
* Do some vaccines cause autism?
* Do cholesterol drugs cause muscle fatigue?
* What are the negative effects of cadmium in foods?
... and so on.
To answer those questions in a user-friendly format, SCIENCE.NaturalNews.com presents over 7,000 keyword terms and phrases categorized into clear, easy-to-understand categories such as "minerals" or "toxic chemicals." Most importantly, we run two-keyword algorithms against the entire PubMed data set to determine which items are most closely associated with other items. (Currently two million studies are included in our data set, and this is being expanded by approximately one million studies each week.) So if you visit our page on MILK, you will discover that the most frequently associated result of milk is ALLERGIES. This is statistically calculated from the entire body of scientific studies. If you click on ALLERGIES from the milk page, you will get a page listing all the studies that cover both MILK and ALLERGIES. Instantly, SCIENCE.NaturalNews.com gives you all the relevant studies that might otherwise take you dozens of hours to find on your own. We also have similar pages for almost any two health-related concepts you might imagine.
A Shortcut to Scientific Knowledge
As you can see, SCIENCE.NaturalNews.com is not merely a categorization of the entire PubMed library of scientific studies; it's also a shortcut to knowledge that allows you to instantly discover relationships between nutrients and health, chemicals and diseases, medical therapies and side effects, and so on. It also gives readers the opportunity to comment on and discuss the scientific studies being presented. The pages also link to related NaturalNews.com articles where much of the science behind these studies is presented in a more friendly article format.
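The two-keyword association described above (for example, MILK and ALLERGIES) can be pictured with a very small sketch. This is only an illustration under stated assumptions, not the code that runs the site: the function names `cooccurrence_counts` and `top_associations`, the toy study titles and the naive substring matching are all hypothetical stand-ins.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(studies, keywords):
    """Count how many studies mention each unordered pair of keywords.

    Assumption: a keyword "appears" in a study if it occurs as a
    case-insensitive substring of the study text (naive matching).
    """
    keywords = [k.lower() for k in keywords]
    pair_counts = Counter()
    for text in studies:
        lowered = text.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(k for k in keywords if k in lowered)
        pair_counts.update(combinations(present, 2))
    return pair_counts


def top_associations(pair_counts, keyword, limit=5):
    """Return the keywords most often found alongside `keyword`."""
    keyword = keyword.lower()
    related = Counter()
    for (a, b), count in pair_counts.items():
        if a == keyword:
            related[b] += count
        elif b == keyword:
            related[a] += count
    return related.most_common(limit)


# Hypothetical toy data, invented purely for illustration.
studies = [
    "Cow's milk and allergies in infants",
    "Milk protein allergies and eczema",
    "Green tea and breast cancer risk",
    "Milk intake and calcium absorption",
]
keywords = ["milk", "allergies", "green tea", "breast cancer", "calcium"]

counts = cooccurrence_counts(studies, keywords)
print(top_associations(counts, "milk"))
```

In this toy data set, "allergies" ranks first for "milk" because two of the four titles mention both terms. Listing the studies behind a pair is then just a matter of filtering for texts that contain both keywords.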
The technology isn't perfect, of course. It still cannot tell the contextual difference between the heavy metal "lead" and a description of a "lead study," meaning the leading study, which has nothing to do with heavy metals. But a solution to that is relatively simple, and we'll be rolling that out soon.
The Technology Behind Science.NaturalNews.com
As you use the site, you'll notice that the response time is incredibly fast, even as you drill down into two-keyword indexing of studies. I am the chief R&D architect and engineer behind the site, and I developed an optimized set of "content tokenization algorithms" to accomplish the results you see on the site. I first developed these algorithms during the engineering of the content system that powers NaturalNews.com. I then significantly expanded and refined these algorithms for the SCIENCE project. Today, these algorithms are able to determine the relational weights between any two text-based concepts as they exist across a very large body of content (a multi-row text data set), even if no prior relationship between those concepts is known. In essence, these are "signal detection" algorithms that discover hidden relationships in large data sets, then present those relationships to the user through a series of easy-to-navigate web pages.
This is a particularly difficult mathematical problem to solve because the number of possible relationships between any n discrete items in a large data set is essentially n squared. (It's actually n * (n-1) if you wish to get technical.) And that n squared data set must be compared against x content items, meaning the number of comparisons that must be made is actually x * n * (n-1). If n = 7000 and x = 21 million, then what you are dealing with is roughly 49 million * 21 million comparisons, which is about one quadrillion (10^15) comparisons. As these are also free-form text comparisons, not numeric comparisons, they are especially processor intensive. It is the equivalent of writing a specialized search engine.
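To check the scale of the problem, the comparison count can be worked out directly from the figures quoted above (n = 7,000 keywords, x = 21 million studies). The helper names below are hypothetical and exist only for this arithmetic.

```python
def ordered_pairs(n):
    """Number of ordered keyword pairs among n keywords: n * (n - 1)."""
    return n * (n - 1)


def total_comparisons(n, x):
    """Brute-force comparisons: every ordered pair checked against x studies."""
    return x * ordered_pairs(n)


n = 7_000        # keyword terms and phrases
x = 21_000_000   # studies in the full PubMed data set

pairs = ordered_pairs(n)
total = total_comparisons(n, x)

print(f"{pairs:,} keyword pairs")      # 48,993,000 keyword pairs
print(f"{total:,} comparisons")        # 1,028,853,000,000,000 comparisons
print(f"about {total:.2e}")            # about 1.03e+15
```

At a purely hypothetical rate of one million text comparisons per second, 1.03 * 10^15 comparisons would take on the order of 30 years, which is why a brute-force pass over raw text is not practical.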
In fact, unless the algorithms are optimized, attempting to run raw text comparisons across the entire data set would take months of computing effort. Storing all the results of the signal detection algorithms is also tricky. We use an enterprise database system with a highly optimized relational table structure that turns text searches into far faster integer searches. The underlying database is written in Transact-SQL (T-SQL) and runs on relatively modest equipment costing less than $10,000 / month to run. The architecture of the site, its executable code and its servers give it a capacity of over 100 million unique visitors per month, allowing for enormous growth as more people discover the value of scientific knowledge in realms of nutrition, superfoods and more.
The underlying technology I've developed for this also has the theoretical ability to perform original human psyche research, such as sensing shifts in human emotions, outbreaks of pandemics, "fear factor" scores for economic collapse (which translates into trust factors for the highly leveraged banking system) and so on. It has not yet been applied to those projects, but I'm always looking for new ways to expand human knowledge. In essence, we now have a universal "signal detection" technology that has so far been applied to one particular data set (PubMed). This same technology could also be applied to many other data sets in order to tease out other relationships that represent psychosociological trends, mass media reporting trends and so on. For example, I believe this technology will be able to accurately predict the coming global banking collapse. We now have the technology to do this, and if time permits, we may pursue this project in the near future.
Enjoy SCIENCE.NaturalNews.com and spread the word! Lots more studies are coming online every day, so the tool will become increasingly valuable and populated over time.
- Mike Adams, the Health Ranger