The client wishes to provide its users with a novel approach to analyzing news on a specific event or topic and to improve user media literacy using modern NLP/AI-based analytics services and data visualization methods. In particular, the user should be able to identify and explore similarities and discrepancies between news articles published by different media outlets. For example, the same political event can be described quite differently depending on the affiliation or political orientation of the news outlet’s editorial board.
On the other hand, news on important events is often republished with no or minimal changes by many websites or other media (so-called syndicated news). Detecting, clustering, and visualizing syndicated news can save the reader considerable time, since only one article from each cluster needs to be reviewed.
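As an illustration of how syndicated copies might be grouped, here is a minimal sketch that compares articles by their overlapping three-word sequences (shingles) and merges any pair whose Jaccard similarity exceeds a threshold. The function names, the shingle size, and the 0.6 threshold are assumptions chosen for the example; the source does not state which method the platform actually uses.

```python
import re
from itertools import combinations


def shingles(text: str, k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles of a text (lowercased, punctuation dropped)."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_syndicated(articles: dict[str, str], threshold: float = 0.6) -> list[set[str]]:
    """Group article ids whose shingle sets overlap above the threshold (union-find)."""
    parent = {aid: aid for aid in articles}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    sets = {aid: shingles(text) for aid, text in articles.items()}
    for a, b in combinations(articles, 2):
        if jaccard(sets[a], sets[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    clusters: dict[str, set[str]] = {}
    for aid in articles:
        clusters.setdefault(find(aid), set()).add(aid)
    return list(clusters.values())
```

The all-pairs comparison is quadratic in the number of articles, which is acceptable for a sketch but would likely need an approximate scheme such as MinHash for large collections.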
We propose to build the data analytics platform with Silk Data’s technologies and free open-source technologies with permissive licenses. Our experience shows that using open-source solutions as a base greatly reduces development time and effort while allowing us to deliver high system quality and reliability.
In particular, open-source packages form the backbone of modern AI, providing field-tested, flexible, and state-of-the-art models and algorithms.
The Silk Data team implemented this request in the form of a web service and a Chrome browser extension with both user and administrative components. This news monitoring tool compares news articles on the same topic and highlights for the reader the nuances of how different news agencies present the news (clicking a link or button in the extension opens a document map for that event).
The news analyzer compares news articles between publications to identify common and/or specific information in news coverage. In this way, the user does not have to read every article to understand the subject and review the main points. With these data mining algorithms, an article of 6000 characters can be reduced to 400 characters. The AI model independently analyzes other articles on the given topic and highlights the main points in the current article.
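To make the idea of condensing an article to a fixed character budget concrete, the sketch below uses a simple frequency-based extractive approach: sentences are scored by the average frequency of their non-stopword terms, and the best ones are kept until the budget is used up. This is an assumed stand-in for illustration only; the `summarize` function, the stopword list, and the scoring rule are hypothetical and do not describe the model used in the product.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "or", "is", "are",
             "was", "were", "for", "by", "with", "that", "this", "it", "as", "at"}


def summarize(text: str, max_chars: int = 400) -> str:
    """Pick the highest-scoring sentences that fit into max_chars, in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence: str) -> float:
        tokens = [w for w in re.findall(r"\w+", sentence.lower()) if w not in STOPWORDS]
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by score, then greedily fill the character budget.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen, used = [], 0
    for i in ranked:
        cost = len(sentences[i]) + (1 if chosen else 0)  # +1 for the joining space
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling `summarize(article_text, max_chars=400)` returns a summary that never exceeds the budget; if no single sentence fits, the result is an empty string.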
The solution supports opinion mining based on natural language processing (NLP). Opinion mining, also known as sentiment analysis, makes it possible to detect the parts of a source that express opinions, feelings, or attitudes about an object or a subject of research.
Loading and analysis of news articles and their updates for content curation and trend analysis.
Detection of clusters of lexically and semantically similar articles on the same topic or event for fake news detection and media bias detection.
Pair-wise comparison of news articles and highlighting of the most repeated passages and key points from other news articles. Showing the sources of identical information.
Display of a list of known news events and social media analytics with filtering by event date, event type, and basic keyword search.
Detection of social bots on media sites by analyzing user actions. When comparing news, not only the text is analyzed, but also user actions on web pages related to the analyzed topic. This feature can be used to identify fake users who spread spam, inflate their reputation, or perform other harmful actions.
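The pair-wise comparison feature in the list above can be sketched with Python's standard `difflib` module: the two articles are split into words, and runs of words that appear verbatim in both are reported as shared passages. The `shared_passages` helper and the five-word minimum are assumptions for this example, not the product's actual matching logic.

```python
from difflib import SequenceMatcher


def shared_passages(text_a: str, text_b: str, min_words: int = 5) -> list[str]:
    """Return word runs of at least min_words that appear verbatim in both texts."""
    words_a, words_b = text_a.split(), text_b.split()
    # autojunk=False disables the heuristic that ignores very frequent tokens.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    return [
        " ".join(words_a[m.a:m.a + m.size])
        for m in matcher.get_matching_blocks()
        if m.size >= min_words
    ]
```

Because `SequenceMatcher` aligns the two word sequences in order, passages that appear in a different order in the two articles may be missed; handling that case would require a different matching strategy.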
Analysis and planning | Describing the functionality that the system should provide. Identifying all project requirements and documenting them in the project specification. |
Design | Transforming the business concept and project requirements into a technical vision of the product. Creating a project design, including project architecture, module division, project behavior description, deployment scheme, static class and domain model, test cases, database design, and API design. |
Implementation | Developing the source code and keeping all project documentation current. Setting up the development environment, implementing functions according to the technical specification and project plan, and updating test cases. |
Integration | Conducting overall project integration by assembling all its parts and modules. Performing developer testing of the entire project, final review, and code updates. Once integration is complete, the project is ready for the stabilization phase. |
Stabilization | Preparing the testing environment, deploying the solution, executing test cases, and checking the product. The software testing team conducts comprehensive testing of the project, and the development team fixes bugs and makes improvements until the build qualifies as a release candidate. |
Deployment | Delivering project results to the client, deploying the product and components in the client's environment, stabilizing deployment, transitioning the project to support mode, and obtaining final approval. |
Acceptance | During acceptance, the client conducts UAT (user acceptance testing). If any issues are discovered, a round of bug fixing is carried out. |
The sample diagram below visualizes a few news articles on the same topic. Each dot’s color denotes the article’s source, its size corresponds to the article length in words, and the relative position of the dots shows the semantic similarity between the articles.
The next screenshot shows a variant of the browser extension running alongside a news article.
The current version of the news comparison app is optimized for 500 to 1000 news articles per event (and up to 15,000 news articles in rare cases).
Beyond news analysis, the technology described above can be applied to other tasks, such as detecting automated bot accounts on social media, because bots usually post nearly identical comments in multiple chats. Similarly, this technology can be used to analyze large collections of documents, for example business contracts during due diligence.
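A rough sketch of the bot-detection idea follows: comments are normalized, and an account is flagged when the same normalized comment shows up in several different chats. The `(account, chat_id, comment)` tuple layout, the function names, and the three-chat threshold are all hypothetical. This version only catches repeats that differ in case, punctuation, or spacing; looser paraphrases would need a similarity measure over word n-grams instead of exact matching.

```python
import re
from collections import defaultdict


def normalize(comment: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace so trivial edits do not hide a repeat."""
    return " ".join(re.findall(r"\w+", comment.lower()))


def suspicious_accounts(posts: list[tuple[str, str, str]], min_chats: int = 3) -> set[str]:
    """Flag accounts that post the same normalized comment in at least min_chats chats.

    Each post is a hypothetical (account, chat_id, comment) tuple.
    """
    chats_per_message: dict[tuple[str, str], set[str]] = defaultdict(set)
    for account, chat_id, comment in posts:
        chats_per_message[(account, normalize(comment))].add(chat_id)
    return {
        account
        for (account, _), chats in chats_per_message.items()
        if len(chats) >= min_chats
    }
```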
News analysis is a solution that analyzes thousands of pages from information sources on the Internet by title and by keywords for a given topic.
Data analysis, in simple terms, is an AI tool that automatically finds information according to specified parameters, compares the content or data on pages for similarities and differences, and identifies and visually displays the most common parts of a topic.
Social bot detection is the ability to detect and identify bots (automated accounts rather than real users) that leave comments or perform other malicious actions on web pages.
Detecting malicious social bots becomes much easier with automated web data analysis tools. These AI solutions can analyze and compare thousands of web pages and the user actions on them.