Encyclopedia Titanica

Classification of Titanic Passenger Data and Chances of Surviving the Disaster

arXiv


While the Titanic disaster occurred just over 100 years ago, it still attracts researchers seeking to understand why some passengers survived while others perished. Using a modern data mining tool (Weka) and an available dataset, we examine which factors or classifications of passengers are strongly related to survival among those who took that fateful trip on April 10, 1912. The analysis looks at passenger characteristics (cabin class, age, and point of departure) and their relationship to the chance of surviving the disaster.

by John Sherlock, Manoj Muniswamaiah, Lauren Clarke, Shawn Cicoria

Key Points

The article analyses the survivability of passengers aboard the Titanic using data mining techniques, specifically decision tree classification and clustering, through tools like Weka. A subset of the Titanic passenger data from Kaggle was used, normalised to nominal data for analysis. The study focused on identifying significant factors affecting survival, such as sex, cabin class, age, and point of embarkation.
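As a rough illustration of what converting records to nominal data can involve, the sketch below maps one raw record to all-categorical attributes. It assumes the column names of the public Kaggle file (Survived, Pclass, Sex, Age, Embarked); only the 20–49 age band comes from the summary here, and the other bin edges and the helper names are hypothetical, not the authors' actual preparation steps.

```python
from typing import Optional


def age_group(age: Optional[float]) -> str:
    """Map a numeric age to a nominal label.

    Only the 20-49 band is mentioned in the summary; the other
    boundaries are assumptions made for this sketch.
    """
    if age is None:
        return "unknown"
    if age < 20:
        return "under_20"
    if age < 50:
        return "20_49"
    return "50_plus"


def to_nominal(row: dict) -> dict:
    """Convert one raw passenger record into all-nominal attributes."""
    return {
        "sex": row["Sex"],
        "class": {1: "first", 2: "second", 3: "third"}[row["Pclass"]],
        "age_group": age_group(row.get("Age")),
        "embarked": row.get("Embarked") or "unknown",
        "survived": "yes" if row["Survived"] == 1 else "no",
    }
```

Keeping missing ages as an explicit "unknown" category, rather than dropping those rows, is one possible design choice; the paper may have handled them differently.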

Key findings include:

  • Sex: Being female was the most significant factor in survival, with women showing a higher likelihood of surviving.
  • Cabin Class: Passengers in first class had higher survival rates compared to those in lower classes, particularly third class.
  • Age Group: Adults aged 20–49 formed the largest group among those who perished, but grouping ages into broad bands limited deeper insight.
  • Embarkation Point: Point of departure showed a weaker correlation with survival, but it appeared related to class distribution.
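Comparisons like those in the list above come down to a survival rate per category. A minimal sketch of that calculation follows; the rows are invented purely to show the mechanics and are not taken from the Titanic dataset, and the function name is hypothetical.

```python
from collections import defaultdict

# Invented rows for illustration only; NOT real passenger records.
ROWS = [
    {"sex": "female", "class": "first", "survived": "yes"},
    {"sex": "female", "class": "third", "survived": "yes"},
    {"sex": "female", "class": "third", "survived": "no"},
    {"sex": "male", "class": "first", "survived": "yes"},
    {"sex": "male", "class": "third", "survived": "no"},
    {"sex": "male", "class": "third", "survived": "no"},
]


def survival_rate_by(rows, attribute):
    """Return {attribute value: fraction of rows with survived == 'yes'}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [survivors, total]
    for row in rows:
        tally = counts[row[attribute]]
        tally[0] += int(row["survived"] == "yes")
        tally[1] += 1
    return {value: survivors / total for value, (survivors, total) in counts.items()}
```

Calling `survival_rate_by(ROWS, "sex")` or `survival_rate_by(ROWS, "class")` gives one rate per category. Such rates describe association only, which matches the authors' caution about inferring causality.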

The study used a J48 decision tree classifier, which reached roughly 81% accuracy, and the Simple K-Means clustering algorithm to visualise relationships. While the findings suggest strong associations, the authors caution against inferring causality without further analysis.
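J48 is Weka's implementation of the C4.5 decision tree algorithm, which chooses each split by gain ratio over nominal attributes. The sketch below shows that criterion alone, on invented rows; it is not the authors' Weka workflow and omits pruning, missing-value handling, and numeric splits. The function names and the toy rows are hypothetical.

```python
import math
from collections import Counter

# Invented rows for illustration only; NOT real passenger records.
TOY = [
    {"sex": "female", "class": "first", "survived": "yes"},
    {"sex": "female", "class": "third", "survived": "yes"},
    {"sex": "male", "class": "first", "survived": "no"},
    {"sex": "male", "class": "third", "survived": "no"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target="survived"):
    """Information gain of splitting on `attribute`, divided by split information."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    total = len(rows)
    # Weighted entropy of the target within each attribute value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0  # attribute has a single value, so it cannot split the data
    return (base - remainder) / split_info
```

On these toy rows, `sex` separates the outcomes perfectly while `class` carries no information, so a C4.5-style tree would split on `sex` first. The ordering says nothing about the real data.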

The paper concludes by highlighting the need for additional research with the complete dataset and exploration of cross-classification dependencies to enhance the model’s accuracy and insights. It also reflects on the learning process, emphasising the importance of data preparation for effective analysis.

Find it on arxiv.org

Encyclopedia Titanica is not responsible for the content of external sites, and the availability of links may change.

About Research References on Encyclopedia Titanica
This item is not available to read on Encyclopedia Titanica, but we have included it as a reference, provided a brief summary of the key points, and linked to the original source to help readers interested in the finer details of the Titanic story.

