Blog Post
How To Train Machine Learning Models
Training a machine learning model? In this post, our expert engineer explains what it takes to train a machine learning model effectively and shares some best practices.
Read More
Blog Post
Sensitivity, Specificity and Disease Testing
Disease testing isn’t perfect. This post illustrates how the prevalence of a disease, including COVID-19, can be estimated using an imperfect test and how the probability of infection given a test result depends on that prevalence. It highlights why a single test result should be only one piece of information used in determining an individual’s infection status.
Read More
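To make the dependence on prevalence concrete, here is a minimal Python sketch. It assumes the standard Bayes' rule calculation for the probability of infection given a positive result, and the Rogan-Gladen correction for estimating prevalence from the observed positive rate. The function names and the numbers in the usage note are hypothetical and are not taken from the post.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(infected | positive test), by Bayes' rule."""
    true_pos = sensitivity * prevalence                    # infected and test positive
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # healthy but test positive
    return true_pos / (true_pos + false_pos)


def estimate_prevalence(positive_rate, sensitivity, specificity):
    """Rogan-Gladen estimate of true prevalence from the observed positive rate.

    Inverts positive_rate = sens * p + (1 - spec) * (1 - p) for p.
    Only meaningful when sensitivity + specificity > 1.
    """
    return (positive_rate + specificity - 1.0) / (sensitivity + specificity - 1.0)


if __name__ == "__main__":
    # Hypothetical test: 95% sensitivity, 95% specificity.
    for prev in (0.01, 0.20):
        ppv = positive_predictive_value(prev, 0.95, 0.95)
        print(f"prevalence={prev:.2f} -> P(infected | positive)={ppv:.3f}")
```

With these assumed figures, the same positive result implies roughly a 16% chance of infection at 1% prevalence but about 83% at 20% prevalence, which is why a single result is only one piece of evidence.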
Blog Post
Why Sample Size and Random Sampling Matters
Recently we tweeted an interesting article on big data from the Financial Times. The author’s key point is that sampling bias and sampling error are possible even with large data sets. As an illustration, the author discusses a classic case in which the Literary Digest incorrectly predicted that Alf Landon would beat FDR in the 1936 election…
Read More
Blog Post
Big Data, Probability and Birthdays: Part 2 of 2
In Part One of this blog post, I discussed how to state an experiment in the form of probability spaces. Determining the sample space and the event space is necessary before we can talk intelligently about probability measures, the topic of this post. Approach 1: Counting. We’ve figured out the sample space…
Read More
Blog Post
Big Data, Probability and Birthdays: Part 1 of 2
Cardinal Peak’s big data practice is expanding as we continue adding data scientists to our staff. In a recent discussion regarding a data set we’re analyzing, a probability problem conceptually equivalent to the following arose: In a room filled with N people, what is the probability that no two of them share a birthday? In…
Read More
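As a rough illustration of the birthday question posed above, the following Python sketch computes the probability that N people all have distinct birthdays. It assumes 365 equally likely birthdays and independence between people, simplifications the excerpt does not state, and the function name is hypothetical.

```python
def prob_all_distinct(n, days=365):
    """Probability that n people all have different birthdays.

    Assumes each of `days` birthdays is equally likely and that
    birthdays are independent (an illustrative simplification).
    """
    if n > days:
        return 0.0  # pigeonhole: at least two people must share a day
    p = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k days already taken.
        p *= (days - k) / days
    return p


if __name__ == "__main__":
    for n in (10, 23, 50):
        print(f"N={n}: P(no shared birthday)={prob_all_distinct(n):.4f}")
```

Under these assumptions the probability drops below one half at N = 23.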