This study compares films by identifying similarities in their plots, using clustering methods to group similar film narratives together. Our common goal is to use a large sample of movies to identify patterns and draw conclusions about the factors that define film genres and shape audience preferences. With that goal set out, we proceed to the discussion of the individual projects.
Our methodology involves:
- Data Collection: Gathering information on each movie, including its synopsis, genre, and release year.
- Text Analysis: Applying natural language processing (NLP) to the plot descriptions to extract significant features of each movie.
- Clustering: Using clustering algorithms, including hierarchical clustering and k-means, to group the movies according to their plots.
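The text-analysis and clustering steps can be illustrated with a minimal sketch. It is not the study's actual pipeline: the synopses, the stopword list, the TF-IDF weighting, and the function names (`tfidf_vectors`, `kmeans_cosine`) are all assumptions made for illustration. It uses only the Python standard library and a k-means-style loop based on cosine similarity (often called spherical k-means). Hierarchical clustering could be substituted in the final step.

```python
import math
import re
from collections import Counter

# Hypothetical toy synopses; the study's real dataset is not reproduced here.
PLOTS = {
    "Space Rescue": "astronaut crew stranded on space station fights to return to earth",
    "Orbit Down": "damaged space shuttle crew struggles to return to earth safely",
    "Paris Hearts": "two strangers fall in love in paris during a rainy summer",
    "Summer Letters": "young lovers fall in love through letters one summer",
}
# Assumed minimal stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "in", "on", "to", "of", "and", "through", "during"}


def tokenize(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def normalize(vec):
    """Scale a sparse vector (dict) to unit L2 length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def tfidf_vectors(docs):
    """Turn each document into a unit-length sparse TF-IDF vector."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # Smoothed inverse document frequency: rarer terms weigh more.
        vec = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1)
               for t, c in counts.items()}
        vectors.append(normalize(vec))
    return vectors


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans_cosine(vectors, k, max_iter=20):
    """Cluster unit vectors by assigning each to its most similar centroid."""
    # Deterministic farthest-first initialisation (an illustrative choice).
    centroids = [dict(vectors[0])]
    while len(centroids) < k:
        farthest = min(vectors, key=lambda v: max(cosine(v, c) for c in centroids))
        centroids.append(dict(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                total = Counter()
                for v in members:
                    total.update(v)  # adds the term weights together
                centroids[c] = normalize(total)
    return labels


titles = list(PLOTS)
labels = kmeans_cosine(tfidf_vectors(list(PLOTS.values())), k=2)
for title, label in zip(titles, labels):
    print(f"cluster {label}: {title}")
```

On this toy input the two space-travel synopses fall into one cluster and the two romance synopses into the other, because each pair shares plot vocabulary that the other pair lacks. A real corpus would need a fuller stopword list and a principled choice of `k`.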
The results reveal meaningful groups of films with similar narrative structures, with the similarities most pronounced in genre and thematic focus. We also identify outliers, films that do not fit into any of these groups, which helps illustrate the range and variety of cinematic stories.
These results can help film producers and marketers understand audiences and their demands, and so develop films suited to them. The study also has implications for movie recommendation systems, which could be extended to recommend films based on plot similarity. Finally, the investigation broadens the general understanding of narrative paradigms and genre classification in cinema.