
Recent language research has focused on detecting truthfulness, deception, optimism, pessimism, and other signals that reveal deeper insights into people and institutions based on their writing. In financial markets, finding “self-referential” writing from institutions is often difficult, because the writing varies significantly with the purpose of the public document. For example, an investor presentation will differ wildly in format and content from a lawsuit notification to investors. Without a consistent format there is no baseline for comparison, which makes language changes hard to measure. Typically, the only notices structured enough to allow consistent comparison of free-form writing are quarterly and annual earnings filings; however, these documents lag in information, as they report historical results and offer little forward guidance.
We propose a solution that will allow firms to predict the “now” and provide immediate transparency while allowing enough constraints to establish a baseline of comparison: Press Releases.
Firms frequently include a description of themselves at the bottom of a press release, giving us a writing sample and the ability to find tonal and language cues. Because the general purpose of the “About section” always stays the same, these recurring descriptions allow apples-to-apples comparisons. The section also imposes structure on content that would otherwise be too unconstrained to produce a reliable baseline.
In addition to announcing the products and services they produce, companies often change how they describe themselves in press releases, both in numerical data (e.g. number of employees) and in tone. Press releases are also more powerful than SEC filings alone, as SEC filings are released too infrequently (quarterly or annually) to measure change rapidly. Press releases make it easier to flag insights about these institutions before the same information reaches the public through filings.
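To make the idea of measuring change between successive descriptions concrete, here is a minimal Python sketch that compares two versions of an About section word by word using the standard-library difflib module. The function names and the Acme sample text are hypothetical and chosen only for illustration; this is not the method Tiingo uses.

```python
import difflib


def description_similarity(old: str, new: str) -> float:
    """Word-level similarity in [0, 1]; 1.0 means identical wording."""
    return difflib.SequenceMatcher(None, old.split(), new.split()).ratio()


def changed_words(old: str, new: str) -> list:
    """List (operation, old_words, new_words) for every non-matching span."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


# Hypothetical sample descriptions from two consecutive press releases.
older = "Acme is a leading provider of widgets."
newer = "Acme is a global provider of widgets."

print(description_similarity(older, newer))
print(changed_words(older, newer))
```

A falling similarity score, or a change in a tonal word such as “leading”, could serve as a flag for a closer look at the release.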
The challenge with obtaining this data is that these sections are often messy and not clearly delineated, which makes consistent extraction difficult. Tiingo Company Descriptions addresses this problem with a proprietary natural language engine that is used throughout our platform.
To produce a database of company descriptions extracted from press releases, we use our extensive historical news database. We built our own internal natural language engine to extract the parts of each page that make up the About section. We also added a few convenience fields to make data analysis on these descriptions easier. For example, we automatically parse the numerical references mentioned in each section of text, so you can more easily identify description-to-description changes. More details are outlined in the specification in the table that follows.
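As a rough illustration of how parsed numerical references can surface description-to-description changes, the Python sketch below pulls the numbers out of two descriptions and compares them. The regular expression, the function names, and the Acme sample text are assumptions made for this example; they do not reflect Tiingo's engine or the fields in the specification.

```python
import re

# Assumed pattern: integers or decimals, with optional thousands separators.
NUMBER_RE = re.compile(r"\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+(?:\.\d+)?")


def extract_numbers(description: str) -> list:
    """Return every numeric reference in the text as a float."""
    return [float(m.replace(",", "")) for m in NUMBER_RE.findall(description)]


def number_changes(old: str, new: str) -> dict:
    """Report numbers that disappeared from, or appeared in, the newer text."""
    old_nums, new_nums = extract_numbers(old), extract_numbers(new)
    return {
        "removed": [n for n in old_nums if n not in new_nums],
        "added": [n for n in new_nums if n not in old_nums],
    }


# Hypothetical About sections from two press releases by the same firm.
earlier = "Acme employs 12,500 people in 40 countries."
later = "Acme employs 11,000 people in 40 countries."

print(number_changes(earlier, later))
```

Here the unchanged country count drops out and only the headcount figures remain, which is the kind of change a reader would want to inspect.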
While the examples above have focused on using language to gain insight into the performance of assets, this dataset also makes it possible to track new products and services and map them to companies. Descriptions are used to communicate updates about new business lines, products, services, and intellectual property. By extracting this information from description data, we can track product shifts and find deeper references to companies than other data extraction methods allow. For example, if Apple were to release a new product, the “iMonitor”, we could tag and link all writing samples, news articles, and other data that mention “iMonitor” to Apple Inc., even if the writings never mention Apple. This allows us to create software that can run due diligence on lesser-known product lines and find consumer insight separate from a company’s financial news.
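The product-to-company linking described above can be sketched in a few lines of Python. The lookup table, the function name, and the sample sentences are hypothetical, and the mapping is hard-coded here purely for illustration, whereas in practice it would be built from the extracted descriptions.

```python
import re

# Hypothetical mapping from product names to the companies that describe them.
PRODUCT_TO_COMPANY = {"iMonitor": "Apple Inc."}


def tag_companies(text: str, product_map: dict) -> list:
    """Return the companies whose products are mentioned in the text."""
    tagged = set()
    for product, company in product_map.items():
        # Whole-word, case-insensitive match on the product name.
        pattern = r"\b" + re.escape(product) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            tagged.add(company)
    return sorted(tagged)


# The article never names the company, yet it can still be linked to it.
article = "Reviewers say the iMonitor ships next month."
print(tag_companies(article, PRODUCT_TO_COMPANY))
```

Whole-word matching keeps a short product name from matching inside a longer, unrelated word.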
We aim to have consistent uptime as well as some of the fastest servers possible with optimized in-memory caching, so we can deliver data quickly and consistently. Check out our independently audited uptime here: Tiingo Uptime
We say quick-to-action because anybody can respond to a ticket within 24 hours, but our team looks to solve issues immediately. Come say hello and check it out: E-mail us at [email protected] and see our response times.