Abstract: It is common for government and public datasets to include narrative fields, such as inspection reports, incident reporting, surveys, 911 calls, fire response, etc. In addition to categorical fields, such as datetime, location, demographics, these datasets tend to include a narrative description (e.g., what happened). It is typically in the narrative field that the most interesting data resides for the purpose of classifying. The problem, is that since the narrative is human interpreted and entered, each entry may be unique and if we use the whole entry as a single value, one will end up with an overfitted model that works only on the training data. In this presentation, I will cover how natural language processing techniques are used to convert narrative fields into categorical data. Level: Intermediate Requirements: One should know basics of linear regression models. No prior programming knowledge is required.