Altair Newsroom

Digital Debunking: Was Data Analytics Able to Predict if Punxsutawney Phil Would See His Shadow?

Punxsutawney Phil, Pennsylvania’s legendary groundhog, made his annual appearance on February 2 to let Americans know that they should expect six more weeks of winter. For those unfamiliar with Groundhog Day: it’s an annual tradition, observed since 1887 by many in the U.S. and Canada, in which a groundhog named Punxsutawney Phil leaves his burrow. What he sees when he emerges determines his prediction for late winter and early spring. If he sees his shadow, we’ll have six more weeks of winter. If he doesn’t, we’re in for an early spring. An entire ceremony dedicated to the event draws large crowds to the small town of Punxsutawney, PA each year.

We should note that Phil, whom Wikipedia calls a “semi-mythical groundhog,” offers predictions that are more educated guess than mathematical certainty: they have been correct about 40% of the time. Still, it’s a fun way to speculate on how early spring weather will arrive. This year, Phil saw his shadow, meaning he thinks we’re in for six more grueling weeks of winter weather.

But before we at Altair knew what Phil’s prediction would be, our team wondered if we could use Altair RapidMiner, our data analytics and artificial intelligence (AI) platform, to predict whether Phil would see his shadow. Altair RapidMiner allows users of all skill levels to go from data analytics idea to adoption, with no-code functionalities via auto ML, code-optional functionalities via the workflow builder, and full-code functionalities. 

While Phil’s accuracy wasn’t in consideration here, we thought this would be a fun way to put Altair RapidMiner to the test. 


Building a Data Model

The machine learning model in Altair RapidMiner software


To start, we imported an Excel table of Punxsutawney, PA’s weather data from the last decade into Altair RapidMiner. We then trained a decision tree model, ran a cross-validation to check its accuracy, and used the model to predict whether Phil would see his shadow.
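For readers who prefer code to a workflow canvas, here is a minimal Python sketch of the same idea: fit a very small decision tree, cross-validate it, then score a forecast. It is not the Altair RapidMiner workflow. The rows, the single `cloud cover` feature, the function names, and the forecast value are all hypothetical, and the "tree" is reduced to a single split (a decision stump) so the example fits in a few lines.

```python
# Hypothetical rows: (cloud cover % on Feb. 2, did Phil see his shadow?).
# Invented for illustration; this is not the weather table used in the article.
ROWS = [
    (10, True), (20, True), (35, True), (40, True), (45, True),
    (60, False), (70, False), (80, False), (90, False), (95, False),
]


def fit_stump(rows):
    """Learn a one-split tree: predict 'shadow' when cloud cover <= threshold."""
    best_threshold, best_correct = None, -1
    for threshold in sorted({cover for cover, _ in rows}):
        correct = sum((cover <= threshold) == shadow for cover, shadow in rows)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def predict(threshold, cover):
    """Apply the learned split to one cloud-cover reading."""
    return cover <= threshold


def cross_validate(rows, k):
    """Train on k-1 folds, score on the held-out fold, and repeat k times."""
    fold_accuracies = []
    for i in range(k):
        test = rows[i::k]  # every k-th row, starting at row i
        train = [row for j, row in enumerate(rows) if j % k != i]
        threshold = fit_stump(train)
        hits = [predict(threshold, cover) == shadow for cover, shadow in test]
        fold_accuracies.append(sum(hits) / len(hits))
    return fold_accuracies


fold_accuracies = cross_validate(ROWS, k=5)
print("fold accuracies:", fold_accuracies)
print("mean accuracy:", sum(fold_accuracies) / len(fold_accuracies))

# Retrain on every row, then score a (hypothetical) Feb. 2 forecast.
model = fit_stump(ROWS)
forecast_cloud_cover = 30
print("prediction (Cast Shadow?):", predict(model, forecast_cloud_cover))
```

The toy rows are deliberately clean, so this sketch scores far better than a real model would. Actual weather data is much noisier.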

Altair RapidMiner’s model prediction


The next image shows the prediction in the “prediction (Cast Shadow?)” column of the single-row table. We used the forecast for February 2 to make this prediction.

A closer look at the Altair RapidMiner prediction entered into the model


For a better look at the set-up, we zoomed in on the table above.

Cross-validation results of the data run


In this last image, the cross-validation results put the model’s accuracy at roughly 55%. That figure comes with an “error margin” of +/- 52.22%, a spread nearly as large as the accuracy itself, which tells us the estimate is highly unstable and the model had little real ability to predict the correct answer. The model did predict that Phil would cast a shadow and see it this year. But even though the results worked in our favor, we know we couldn’t actually predict the outcome of Groundhog Day. In any case, Altair RapidMiner's predictions aren’t nearly as entertaining as the spectacle of a man in a top hat reading written weather predictions from a mythical groundhog.
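To see how an accuracy figure can carry such a wide margin, here is a short Python sketch. It assumes the number after the +/- is the standard deviation of accuracy across the cross-validation folds, which is a common way to summarize such results. The per-fold accuracies below are hypothetical, chosen only so that they average 55%. They are not the values from our run.

```python
from statistics import mean, stdev

# Hypothetical per-fold accuracies (not Altair's results). With only about a
# decade of data, each fold holds one or two years, so a fold tends to score
# either 0% or 100%.
fold_accuracies = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.5]

average = mean(fold_accuracies)
spread = stdev(fold_accuracies)  # sample standard deviation across folds

print(f"accuracy: {average:.2%} +/- {spread:.2%}")
```

With these made-up numbers the spread comes out close to 50%, nearly as large as the average itself. The pattern is typical of very small datasets: individual folds swing between all right and all wrong, so the mean says little about how the model would do on a new year.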

On the other hand, we’re confident Altair RapidMiner is better than Punxsutawney Phil for gaining data-driven insights, and scaling, extracting, and studying data trends in an organization. Available through our unique Altair Units licensing model, Altair RapidMiner reinforces Altair’s existing data analytics portfolio by providing a stronger experience for users to understand and transform data.

Oh, and bundle up – six more weeks of winter to go!
