Chapter 5 Results
To address the questions raised in Chapter 1, we first analyze the Arrest Dataset along both the time and space dimensions to establish a solid baseline grounded in the data officially recorded by the NYPD. We then explore the Complaint Dataset to deepen the understanding obtained from the Arrest Dataset, and we compare the two data sets to seek more comprehensive insights. Finally, we turn to the Hate Crime Dataset, which we explore mainly along the space dimension; the results are presented with D3 tools to enable dynamic illustrations.
5.1 Crime Arrest Analysis
From 2006 to 2020, the NYPD recorded over 5.1 million arrests. We analyze these records along both the time and space dimensions.
The analysis of the Arrest Data proceeds as follows: we first analyze the time span, then the geographic distribution of crimes, and finally we combine the two dimensions for further insights.
For the time span, we perform a time analysis on the different levels of offense (felony, misdemeanor, violation, and other). The whole time span runs from 2006 to 2020.
For the geographic part, we first provide an overall visualization of the geographic distribution of the arrests. Then we zoom in to the boroughs (Bronx, Staten Island, Brooklyn, Manhattan, and Queens).
For the combined analysis of time and geography, we select 2010, 2015, 2019, and 2020 as featured years. We then analyze the distribution of crime categories across boroughs at these time points.
5.1.1 Time Analysis
We first transform the arrest data set so that the trend of the data can be illustrated.
We then draw a line plot of the overall incidents for each level of offense to see whether there is any dramatic change in the pandemic year.
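The transformation step can be sketched as follows. This is a minimal illustration rather than the exact procedure used for this report: it assumes that each arrest record carries a date string in MM/DD/YYYY form and a one-letter offense-level code (F, M, V, with anything else treated as "other"). The function name and the sample records are made up.

```python
from collections import Counter
from datetime import datetime

# Assumed mapping from one-letter codes to offense levels (hypothetical).
LEVELS = {"F": "felony", "M": "misdemeanor", "V": "violation"}

def yearly_counts(records):
    """Count arrests per (year, offense level).

    `records` is an iterable of (date_string, level_code) pairs, where the
    date is assumed to be formatted as MM/DD/YYYY.
    """
    counts = Counter()
    for date_string, code in records:
        year = datetime.strptime(date_string, "%m/%d/%Y").year
        counts[(year, LEVELS.get(code, "other"))] += 1
    return counts

# Made-up sample records, for illustration only.
sample = [
    ("01/15/2019", "M"),
    ("03/02/2019", "F"),
    ("07/30/2020", "M"),
    ("11/11/2020", "M"),
    ("12/01/2020", "I"),
]
counts = yearly_counts(sample)
```

The resulting table of counts per year and level is the kind of input a line plot with one line per offense level needs.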
As shown in the graph, all categories of crime show a general declining trend from 2006 to 2020. The total number of arrests across all categories hits a historical low of 139 024 cases in 2020.
For misdemeanor arrests, the yearly total increases slowly before 2010 and reaches a historical high in 2010. After the peak, there is a 10-year decline through 2020. From 2010 to 2014, the decrease is relatively smooth, amounting to only 11.16% of cases. The number then falls dramatically in 2015 to a total of 220 169, a 15.19% decrease year-on-year, which starts a period of fast decline. From 2015 to 2020, the decrease in misdemeanor arrests is, astonishingly, 84.74%, or 186 591 cases. In 2020, the total records its largest year-on-year percentage decrease in 14 years, 42.33%, falling to 73 015.
For felony, the total number of arrests remains high before 2009, at over 100 000 cases per year. Starting in 2010, the number stabilizes between 90 000 and 100 000. A decreasing trend starts in 2016; by 2019, the decrease amounts to 11.69%, or 11 038 cases. As with misdemeanor, a large decrease occurs in 2020: 19.18% year-on-year, or 18 106 cases.
For violation, an increase is observed between 2006 and 2009. From 2009 to 2014, there is a 5-year flat period in which the yearly total barely changes. A decline starts in 2015 and reaches a very low level in 2018 (3017). In 2020, just like misdemeanor and felony, the number drops sharply, from 2822 to 549.
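The percentage figures in this section follow the usual percentage-change formula. As a small worked example, using only the two violation counts stated above (2822 and 549), and with `pct_decrease` being a helper name introduced here for illustration:

```python
def pct_decrease(previous, current):
    """Percentage decrease from `previous` to `current` (positive means a fall)."""
    return (previous - current) / previous * 100

# Violation arrests went from 2822 to 549, as stated in the text.
violation_drop = pct_decrease(2822, 549)
print(f"{violation_drop:.2f}%")
```

With these two counts, the drop in violation arrests comes out at roughly 80.5%.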
Then, we investigate the share of each level of offense by year using a stacked bar plot.
Over the years up to 2020, misdemeanor remains the most frequent level of offense, but its share decreases significantly:
- In 2010, the share of misdemeanor reaches a historical high of 69.38%, over 2/3 of all cases.
- In 2019, the share is 59.36%.
- In 2020, with the general decrease in cases across all categories, the share hits a historical low of 52.52%, just above half of the cases.
Moreover, the steadily shrinking share of violation-level events suggests an improvement in community safety.
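Each share in the stacked bar plot is a level's count divided by the yearly total. The sketch below checks the 2020 share, assuming that the totals of 139 024 arrests and 73 015 misdemeanor arrests quoted in Section 5.1.1 both refer to 2020; `share` is a helper name introduced here for illustration.

```python
def share(count, total):
    """Share of `count` in `total`, as a percentage."""
    return count / total * 100

# Assumed to be the 2020 misdemeanor count and the 2020 total (see lead-in).
misdemeanor_share_2020 = share(73_015, 139_024)
print(f"{misdemeanor_share_2020:.2f}%")
```

Under that assumption, the result agrees with the 52.52% reported for 2020.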
5.1.2 Geographic Dimension: Overall Illustration
We marked the locations of all arrests in 2020 on a map.
According to our interactive map, the following areas have more reported cases than other areas: downtown Brooklyn, Harlem in Manhattan, Morrisania in Bronx, and downtown Manhattan.
Manhattan is generally safer than the other boroughs (not including Staten Island, because of population differences).
Harlem in Manhattan has the largest number of arrests.
The main function of our map is to serve as a general guide to safety. Anyone in New York City can use it to see what the public safety situation was in their area. Furthermore, we improved readability by aggregating the arrests into their “most common centers”. As the user zooms in or out, the aggregation level adjusts to the zoom scale, so whatever scale of arrest location information is wanted (say, from boroughs down to streets), the map always provides a suitable aggregation.
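The idea of zoom-dependent aggregation can be illustrated with a simple grid binning. This sketch is not the clustering method used by the interactive map, which is not specified here; it is a stand-in that assumes square cells whose size halves with each zoom step, and the function name, zoom values, and coordinates are all made up.

```python
from collections import Counter

def aggregate(points, zoom):
    """Group (lat, lon) points into square grid cells for a given zoom level.

    The cell size halves with each zoom step, so zooming in splits clusters
    and zooming out merges them. Returns a Counter mapping cell -> count.
    """
    cell_size = 360.0 / (2 ** zoom)  # degrees per cell at this zoom level
    cells = Counter()
    for lat, lon in points:
        cell = (int(lat // cell_size), int(lon // cell_size))
        cells[cell] += 1
    return cells

# Made-up coordinates near New York City, for illustration only.
points = [
    (40.8116, -73.9465),
    (40.8120, -73.9470),
    (40.6930, -73.9870),
    (40.8300, -73.9050),
]

city_view = aggregate(points, zoom=3)     # coarse: everything in one cell
street_view = aggregate(points, zoom=16)  # fine: only close neighbors merge
```

At the coarse zoom level all four sample points fall into a single cell, while at the fine level only the two nearly identical locations remain grouped.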
5.1.3 Geographic Dimension: Zoom in to Boroughs & Combining Time Series Analysis
We can zoom in to investigate the detailed spatial distribution of the offense incidents.
As mentioned in the time analysis above, the years 2010, 2015, and 2019 are turning points or starting points of trends, and 2020 is the year when COVID-19 broke out, which is the main target of this project. Building on the overall geographic navigation tool above, we analyze the distribution of crime categories in the 5 boroughs of New York City.
At all 4 time points, Brooklyn has the highest number of arrests among the 5 boroughs. Manhattan follows Brooklyn closely, Bronx and Queens follow Manhattan in that order, and Staten Island always has the lowest number. This pattern may be related to the population distribution.
All 5 boroughs did well in reducing the number of violations. In 2020, almost all arrests fall under misdemeanor and felony.
Both misdemeanor and felony arrests decrease in all 5 boroughs over the years. However, the number of misdemeanor arrests decreases much faster than the number of felony arrests. In 2010, misdemeanor accounts for almost 70% of arrests. In 2019, this share in Brooklyn is around 50%. In 2020, the share in all 5 boroughs is approximately 50%.
Bronx, Brooklyn, and Manhattan achieved a cut of almost 50% in arrests from 2010 to 2019, while Queens achieved only 40%.
Although the number of arrests in Staten Island is relatively small compared to the other boroughs, Staten Island achieved a surprising 60% cut in its number of arrests.
Even though Bronx has a much lower total number of arrests than Manhattan in 2010, 2015, and 2019, in 2020 the number of arrests in Bronx (32 382) is almost the same as in Manhattan (32 609); the difference divided by the Manhattan figure is less than 1%. Manhattan performs much better than Bronx in cutting the total number, considering that in 2010 the difference between the 2 boroughs divided by the Manhattan figure is over 11%, or 12k.
The same thing happened between Manhattan and Queens. In 2010, the gap divided by the Manhattan figure is over 32%, or 36k. However, in 2020, the gap is merely 8.5%, or 2.7k.
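The borough gaps above are expressed relative to the Manhattan figure. A minimal check of the 2020 Bronx comparison, using only the two counts quoted in the text (`relative_gap` is a helper name introduced here for illustration):

```python
def relative_gap(reference, other):
    """Gap between two counts, as a percentage of `reference`."""
    return (reference - other) / reference * 100

# 2020 arrest counts quoted in the text.
manhattan_2020 = 32_609
bronx_2020 = 32_382

gap = relative_gap(manhattan_2020, bronx_2020)
print(f"{gap:.2f}%")
```

The gap of 227 arrests amounts to about 0.7% of the Manhattan figure, consistent with the "less than 1%" stated above.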
5.2 Crime Complaint Analysis
To provide a more comprehensive understanding of the trend in crime numbers, we bring in the Complaints Data. We analyze this data set as follows:
- Detect preliminary trends by analyzing this data set alone.
- Combine the Arrest and Complaint data sets.
- Provide an NLP analysis to identify the actual contents of the complaints.
5.2.1 Time Analysis
Although the Arrests Data shows a dramatic fall over the years, the Complaints Data is different. Misdemeanor still shows a material decrease, but felony and violation do not. Complaints about violations even show an increasing trend starting in 2011. We can check this further by combining the arrests and complaints data sets.
The 2 data sets show the following patterns:
- Felony and misdemeanor share the general trend of increase and decrease in different periods. That is, when the number of arrests falls, the number of complaints falls, too.
- From 2006 to 2020, the numbers of violations in complaints and arrests diverge throughout. This category moves in almost opposite directions in the two data sets. From 2006 to 2011, as arrests rise, complaints fall slightly. From 2012 to 2020, as the number of arrests falls dramatically, complaints about violations show an upward trend.
We can verify our observations more precisely. We merged the 2 data sets into a comprehensive one by calculating the ratio of arrests to complaints. The smaller the ratio, the bigger the gap between complaints and arrests (complaints > arrests).
The most important pattern is that, starting from 2013, the violation ratio decreases in consecutive years. If we regard the arrests as the crimes that actually happened, law enforcement has successfully rooted out violations.
Both felony and misdemeanor stay within a relatively stable range before 2014. Starting from 2015, both ratios enter a period of decline. The misdemeanor ratio implies a bigger gap between arrests and complaints.
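The merge by ratio can be sketched as below. This is an illustration under assumptions, not the code used for the report: it assumes both data sets have already been reduced to counts keyed by (year, offense level), and the function name and all counts shown are placeholders rather than figures from the data.

```python
def arrest_complaint_ratio(arrests, complaints):
    """Merge two {(year, level): count} tables into {(year, level): ratio}.

    Only keys present in both tables with a non-zero complaint count are kept.
    """
    return {
        key: arrests[key] / complaints[key]
        for key in arrests.keys() & complaints.keys()
        if complaints[key] > 0
    }

# Placeholder counts, for illustration only (not figures from the data sets).
arrests = {(2013, "violation"): 50, (2020, "violation"): 10}
complaints = {(2013, "violation"): 100, (2020, "violation"): 200}

ratios = arrest_complaint_ratio(arrests, complaints)
```

A ratio that shrinks from one year to the next, as in the placeholder data, corresponds to a widening gap between complaints and arrests.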
5.2.2 NLP Analysis
The word cloud displays the actual contents of the complaints. The top 3 levels of content mainly concern stealing and assault. Specifically:
- Larceny is the most common complaint.
- Petit, harassment, and assault make up the second level.
- Offenses and mischief form the third level of common complaints.
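A word cloud of this kind is driven by word frequencies. The following sketch shows one simple way to obtain them; it assumes the complaints come with short free-text offense descriptions, does no stop-word filtering, and uses made-up sample descriptions and a hypothetical function name.

```python
import re
from collections import Counter

def word_frequencies(descriptions):
    """Count lowercase alphabetic words across complaint descriptions."""
    counts = Counter()
    for text in descriptions:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

# Made-up descriptions, for illustration only.
sample = [
    "PETIT LARCENY",
    "GRAND LARCENY",
    "HARASSMENT",
    "ASSAULT AND RELATED OFFENSES",
    "PETIT LARCENY",
    "CRIMINAL MISCHIEF",
]
frequencies = word_frequencies(sample)
top_word, top_count = frequencies.most_common(1)[0]
```

The word sizes in a word cloud are then scaled by these counts, so the most frequent word appears largest.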
5.3 Hate Crime Analysis
Based on the heat map, we can list our findings from high to low:
- Harlem in Manhattan has the most hate crimes.
- Jamaica, Rochdale, Laurelton, Cambria Heights, and Queens Village are very close to Harlem in number.
- The area close to Broadway in Brooklyn, the area close to the Belt Pkwy in Brooklyn, Flushing in Queens, and Midtown & Upper Town in Manhattan have higher numbers of hate crimes than most other areas in New York City.
- Staten Island has a very small number of hate crimes.