Analysis of Women Safety in Indian Cities Using Machine Learning on Tweets

Women and girls experience considerable violence and harassment in public places in various cities, ranging from stalking to sexual harassment and sexual assault. This paper focuses on the role of social media in promoting the safety of women in Indian cities, with special reference to social media websites and applications including Twitter, Facebook and Instagram. It also considers how a sense of responsibility can be developed among ordinary people in Indian society so that they pay attention to the safety of the women around them. Tweets, which usually contain images, text, written messages and quotes about the safety of women in Indian cities, can be used to spread a message among Indian youth and to educate people to take strict action against, and punish, those who harass women. Twitter and Twitter handles whose hashtag messages spread widely across the globe serve as a platform for women to express how they feel when they go out to work or travel on public transport, what their state of mind is when they are surrounded by unknown men, and whether they feel safe or not.


INTRODUCTION
Certain forms of harassment and violence, including staring and passing comments, are aggressive and unacceptable, yet they are usually seen as a normal part of urban life. Several studies conducted in cities across India show that women report similar kinds of sexual harassment and comments from unknown people. A study conducted across popular metropolitan cities of India, including Delhi, Mumbai and Pune, showed that 60% of women feel unsafe while going out to work or while traveling on public transport.
Women have the right to the city, which means that they can go freely wherever they want, whether to an educational institute or to any other place. But women feel unsafe in places like shopping malls and on the way to their workplace because of the many unknown eyes body-shaming and harassing them. A lack of safety, or a lack of concrete consequences, in the lives of women is the main reason for the harassment of girls. There are instances when girls were harassed by their neighbors on the way to school, or when a lack of safety created a sense of fear in the minds of young girls, who suffer throughout their lives because of that one instance in which they were forced to do something unacceptable or were sexually harassed by a neighbor or an unknown person. The safe cities approach treats women's safety from the perspective of women's right to the city without fear of violence or sexual harassment.
Vol11 Issue 06, Nov 2021, Page 752
Rather than imposing the restrictions on women that society usually imposes, it is the duty of society to appreciate the need to protect women and to recognize that women and girls have the same right as men to be safe in the city. The collection of Twitter texts analyzed also includes the names of people, including women, who stand up against sexual harassment and the unethical behavior of men in Indian cities that makes women uncomfortable walking freely. The data set obtained through Twitter about the status of women's safety in Indian society was further processed through machine learning algorithms to smooth the data by removing zero values, using Laplace smoothing and Porter's stemming approach to develop a method of analysis, and to remove retweets and redundant data, so that a clear and original view of the safety status of women in Indian society is obtained.
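The preprocessing described above can be illustrated with a minimal sketch. It is not the code used in this work: the function names are hypothetical, a retweet is assumed to be any tweet whose text starts with "RT @", and smoothing is shown as standard add-one (Laplace) estimation over a small example vocabulary, so that unseen words do not receive a zero probability.

```python
from collections import Counter


def drop_retweets_and_duplicates(tweets):
    """Keep the first copy of each tweet and skip retweets.

    Assumption: a retweet is any tweet whose text starts with "RT @".
    """
    seen = set()
    kept = []
    for text in tweets:
        # Normalize whitespace and case so near-identical copies match
        normalized = " ".join(text.split()).lower()
        if normalized.startswith("rt @") or normalized in seen:
            continue
        seen.add(normalized)
        kept.append(text)
    return kept


def laplace_probabilities(words, vocabulary, alpha=1.0):
    """Add-alpha (Laplace) estimate of P(word) over a fixed vocabulary.

    Every vocabulary word gets a non-zero probability, even if unseen.
    """
    counts = Counter(w for w in words if w in vocabulary)
    total = sum(counts.values()) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


# Hypothetical example tweets, not taken from the data set
tweets = [
    "I feel unsafe on the night bus",
    "RT @someone: I feel unsafe on the night bus",
    "i feel  unsafe on the night bus",
    "The metro felt safe today",
]
unique = drop_retweets_and_duplicates(tweets)
vocab = {"safe", "unsafe", "scared"}
words = [w for t in unique for w in t.lower().split()]
probs = laplace_probabilities(words, vocab)
```

Here two of the four tweets survive, and the unseen word "scared" still receives a small non-zero probability instead of zero.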

Objective of the Project:
Results of sentiment analysis can be used in many areas, such as gauging sentiment about a particular brand or product release, analyzing public opinion on government policies, and understanding people's thoughts on women. A lot of study has been done on data obtained from Twitter in order to classify tweets and analyze the outcome. In this paper we also review some studies on machine learning and examine how to perform sentiment analysis on Twitter data using that domain.

Scope of the Project:
The project scope is restricted to machine learning algorithms and models. Staring at women and passing comments are forms of violence and harassment, and these unacceptable practices are usually treated as normal, especially in urban life. Many studies conducted in India show that women have reported sexual harassment and the other practices stated above. Such studies have also shown that in popular metropolitan cities like Delhi, Pune, Chennai and Mumbai, most women feel unsafe when surrounded by unknown people.

Advantages
The collection of Twitter texts analyzed also includes the names of people, including women, who stand up against abuse, harassment and the unethical behavior of men in Indian cities that makes women uncomfortable walking freely. The data set obtained through Twitter reflects the status of women's safety in Indian society.

Disadvantages
Most people use platforms such as Twitter and Instagram to express their emotions and their opinions about Indian cities and Indian society. Sentiment analysis methods can be categorized as machine learning, hybrid and lexicon-based approaches. Another categorization distinguishes statistical, knowledge-based and age-wise differentiation approaches.

Software Requirements:
The functional requirements or the overall description documents include the product perspective and features, operating system and operating environment, graphics requirements, design constraints and user documentation.

Hardware Requirements:
Minimum hardware requirements are very dependent on the particular software being developed by a given Enthought Python / Canopy / VS Code user. Applications that need to store large arrays/objects in memory will require more RAM, whereas applications that need to perform numerous calculations or tasks more quickly will require a faster processor.

LITERATURE SURVEY
Existing System:
The existing concept is to analyze women's safety using social networking messages by applying machine learning algorithms to them. Nowadays almost everyone uses social networking sites to express their feelings. If a woman feels unsafe in an area, she will use negative words in her posts, tweets or messages, and by analyzing those messages we can detect which areas are more unsafe for women.

Proposed System:
The proposed work uses the Tweepy package for Python to download tweets from Twitter. Because an internet connection is not always available for downloading tweets online, we downloaded MeToo tweets on women's safety in advance and saved them inside the dataset folder. The application reads these tweets to detect women's sentiments.
We use NLTK (Natural Language Toolkit) to clean the tweets by removing special symbols and stop words.
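The cleaning step can be sketched as follows. This is only an illustration using the Python standard library: the small stop-word set is a stand-in for NLTK's English stop-word list, the function name clean_tweet is hypothetical, and the removal of links and user mentions is an added assumption rather than something stated above.

```python
import re

# Illustrative stand-in for NLTK's English stop-word list (assumption)
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "i", "to",
    "on", "in", "at", "of", "and", "my",
}


def clean_tweet(text):
    """Lowercase, strip links, mentions and symbols, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links (assumption)
    text = re.sub(r"@\w+", " ", text)          # remove user mentions (assumption)
    text = re.sub(r"[^a-z\s]", " ", text)      # remove digits and special symbols
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return " ".join(tokens)


# Hypothetical example tweet
cleaned = clean_tweet(
    "I was followed on my way to the office!! #MeToo @friend https://example.com/x"
)
```

With NLTK installed and its stop-word corpus downloaded, the hand-written set could be replaced by the list returned from nltk.corpus.stopwords.words("english").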
We also use the TextBlob corpora package and dictionary to compute polarity and count positive, negative and neutral tweets: tweets with a polarity value less than 0 are considered negative, those with polarity greater than 0 and less than 0.5 are considered neutral, and those with polarity greater than 0.5 are considered positive.
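The thresholds above can be written as a small function. This is a sketch, not the project's code: the name label_from_polarity is hypothetical, and because the text does not say how the boundary values 0 and 0.5 are handled, the sketch assumes they are treated as neutral.

```python
def label_from_polarity(polarity):
    """Map a polarity score in [-1, 1] to a sentiment label.

    Thresholds follow the text: below 0 is negative, above 0.5 is positive.
    Assumption: the boundary values 0 and 0.5 are treated as neutral.
    """
    if polarity < 0:
        return "negative"
    if polarity > 0.5:
        return "positive"
    return "neutral"


# With TextBlob installed, a score could be obtained as:
#   from textblob import TextBlob
#   polarity = TextBlob("some cleaned tweet").sentiment.polarity
# Hypothetical scores used here so the sketch runs without TextBlob
example_scores = [-0.6, 0.0, 0.3, 0.5, 0.8]
labels = [label_from_polarity(p) for p in example_scores]
```

Keeping the thresholds in one function makes it easy to change the boundary handling if a different convention is preferred.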

MODULES
• Upload dataset: Using this module we upload the dataset.
• Dataset cleaning: Using this module we find empty values in the dataset and replace them with the mean or 0.
• Train & Test Split: Using this module we split the dataset into two parts, training and testing. All machine learning algorithms use 80% of the dataset to train the classifiers and the remaining 20% to test classifier prediction accuracy.
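The cleaning and splitting modules above can be sketched with the standard library alone. The function names, the use of None to mark an empty value, and the fixed random seed are assumptions made for illustration; a library such as pandas or scikit-learn could be used instead.

```python
import random


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the present values (0 if none)."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present) if present else 0.0
    return [mean if v is None else v for v in values]


def train_test_split_80_20(rows, seed=42):
    """Shuffle a copy of the rows and split them 80% / 20%."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical column with two empty values
scores = fill_missing_with_mean([0.2, None, -0.4, None, 0.8])
# Ten hypothetical rows, identified by index
train, test = train_test_split_80_20(range(10))
```

Shuffling before the cut avoids a split that depends on the order in which the tweets were stored.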

RESULTS
Step 1: Import libraries: tkinter for the GUI (front end), TextBlob for processing textual data, matplotlib for data visualization, pandas for data analysis and preprocessing, NumPy for mathematical operations, and NLTK for building Python programs that work with text (here, removing special symbols and stop words).
Step 2: Define the main function and set the title and size of the tkinter window.
Step 6: Define the tweet cleaning function.
In the tweet listing screen, each line represents one tweet, and you can scroll down the text area to view all tweets. These tweets still contain special symbols and stop words; to clean them, click the 'Tweets Cleaning' button.
In the sentiment screen, each tweet's text is shown followed by its sentiment and polarity score. Scroll down the text area to see all tweets.
Now click the 'Women Safety Graph' button to get the results, from which the user can easily understand whether the area is safe or not. If the area is safe, more people will post positive or neutral tweets; if it is not safe, more people will post negative tweets. In the resulting graph, 0.74 multiplied by 100 gives 74%, which means that 74% of people are talking negatively and the area is not safe; the remaining shares are only 22% and 3%.
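The percentage calculation behind the graph can be sketched as follows. The labels are hypothetical and are not the paper's data, and the function name sentiment_shares is an assumption made for illustration.

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the percentage of tweets carrying each sentiment label."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {}
    return {
        name: 100.0 * counts[name] / total
        for name in ("negative", "neutral", "positive")
    }


# Hypothetical labels, not the paper's data
labels = ["negative"] * 6 + ["neutral"] * 3 + ["positive"] * 1
shares = sentiment_shares(labels)
# The label with the largest share drives the safe / not safe reading
most_common = max(shares, key=shares.get)
```

The resulting values could be passed to a matplotlib pie or bar chart to produce a graph like the one described above.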

CONCLUSION
Throughout this paper we have discussed various machine learning algorithms that can help organize and analyze the huge amount of Twitter data available, including the millions of tweets and text messages shared every day. These algorithms, including the SPC algorithm and linear algebraic factor model approaches, are effective and useful for analyzing large amounts of data and help to categorize the data into meaningful groups. Support vector machines are another machine learning algorithm that is popular for extracting useful information from Twitter and getting an idea of the status of women's safety in Indian cities.

Future Enhancement:
As a future enhancement, we can extend this work by applying these machine learning algorithms to other social media platforms such as Facebook and Instagram, since our project considers only Twitter. Present ideology which is