Chapter 7 Conclusion
By using the R and D3 programming in our exploratory data analysis class. Three of us are able to perform a deep analysis on NYC public safety issue. We have generated some noteworthy findings and speculations.
7.1 Brief Conclusion
In brief, the analysis for Arrest
Dataset illustrates firstly in time span the amount and ratio trend of the incidents for each level of offense, and secondly in space span the offense level distribution of those incidents by borough. The key conclusion follows as:
The number of offense events decreases significantly through the years.
Over the years since 2016 to 2020, though the misdemeanor level of offense remains as the most frequently happened events, its ratio suffers a significant decrease.
Manhattan is generally safer than other boroughs (not including Staten Island because of population differences).
The exploration for Complaint
Dataset provides us with another insight to compensate for our perception on the previous dataset, where we focus on the analysis mainly in time span and word frequency:
Though the arrested events decrease significantly among years, the general complaint data level almost remains the same from 2016 to 2020. To be more specific, with respect to the arrest/complaint ratio, starting from 2013, a consecutive decreasing in Violation happened; Starting 2015, both of the felony and misdemeanor ratios entered a decreasing period.
The top 3 level of the complaint contents are mainly about stealing and assaulting.
7.2 Limitations
Limitation of the crime data: To obtain a more comprehensive idea about the post-pandemic crime phenomenon, we need a more detailed list of the data after the year 2021, which requries the updates of the data by NYPD.
Limitation of the implementation: Since export statements of JavaScript modules can not be used in embedded scripts, we have to include all variables, including constant variables in the same JavaScript files. This reduced the readability of our code for interactive part.
7.3 Future Directions
Deploy an automatic data fetch API and enable the functionality within our current implementation so that we could keep an automatic update to track the newest crime on map.
Beyond the EDAV of the NYPD data, we could further build up some of the models, especially in the NLP analysis and time series analysis to systematically remove the noises in the dataset.
Collect population data and income data corresponding to each neighborhood. So that we can discover if there is some correlation between population, incomes and crimes rates.