Life of a Call Data Record

A call detail record (CDR) is a data record produced by a telephone exchange or other telecommunications equipment that documents the details of a telephone call or other telecommunications transaction. CDRs represent a wealth of information that can be mined to discover patterns in both calling behavior and service feature usage (e.g., SMS, MMS). The majority of CDRs are generated by telecommunication switches and Intelligent Network (IN) call-processing nodes that handle specific service transactions, such as toll-free telephone number translation and mobile number portability. Because telephone switches and other network elements have limited storage capacity, CDR files are fetched from them on a periodic basis (e.g., every two minutes) and stored in a staging area that offers reliable near-term storage. This near-term storage component generally provides no advanced CDR analysis functionality: the emphasis is on billing (including real-time charging), which requires efficient and timely off-loading of the original CDR files from inline session-management network components.
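To make this concrete, here is a minimal sketch of parsing one raw CDR line into a keyed record. The pipe-delimited layout and field names are illustrative assumptions; real switch export formats vary by vendor.

```python
import csv
from io import StringIO

# Hypothetical pipe-delimited CDR export line (illustrative only).
RAW = "2024-01-15T10:32:07|MSC01|VOICE|14155550100|14155550199|183|ANSWERED"

# Assumed field layout for this sketch; a real deployment would take
# this from the switch vendor's CDR format specification.
FIELDS = ["timestamp", "switch_id", "service", "caller", "callee",
          "duration_sec", "disposition"]

def parse_cdr(line):
    """Split one raw CDR line into a dictionary keyed by field name."""
    values = next(csv.reader(StringIO(line), delimiter="|"))
    record = dict(zip(FIELDS, values))
    record["duration_sec"] = int(record["duration_sec"])
    return record

cdr = parse_cdr(RAW)
```

In practice a fetcher would apply a parser like this to every line of each file pulled from the staging area before the records move downstream.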

Enabling near-real-time CDR mining and analysis requires a scalable infrastructure. Although many telecoms have adopted big data systems to handle the high volumes, most still fall short of near real time, because they move the data out of the near-term storage component and insert it into longer-term storage (for billing applications as well as data warehouses for decision support and business intelligence). Such extractions are typically performed by customized Extract, Transform, and Load (ETL) flows, and these flows must address many challenges not found in traditional data warehousing projects. In particular:

  • Flexible CDR Processing: The system should adapt to the introduction of new service CDRs and to changes in existing CDR attributes, and should support multiple versions in a timely and reliable manner with minimal changes to the processing flows.
  • Scalable Processing, not only Storage: With the introduction of big data technologies, systems now have the capacity to process high volumes of CDRs, but they still have to overcome the challenge of moving data from one store to another.
  • Variety of Services: Challenges arise when multiple services co-exist, or when different versions of the same service are offered to different subscriber groups. This matters because the same data must be presented to different tools and departments within the telecom, each with its own way of storing and analysing data.
  • Legacy Problems: Another major challenge is adapting to legacy systems. Many processes have been developed over time, so changing software is not easy, and most transformation and processing pipelines depend on data formats that are very specific in nature.
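The first of these challenges, flexible CDR processing, can be sketched as a normalization step that maps any CDR version onto a stable target schema: missing attributes are defaulted and unknown ones are preserved rather than rejected. The field names and defaults here are illustrative assumptions, not a real operator schema.

```python
def normalize(record, required, defaults=None):
    """Map a CDR of any version onto a stable target schema.

    Missing attributes get defaults; attributes the target schema
    does not yet know about are kept aside, so a new service CDR
    or a new attribute does not break the processing flow.
    """
    defaults = defaults or {}
    out = {k: record.get(k, defaults.get(k)) for k in required}
    out["extra"] = {k: v for k, v in record.items() if k not in required}
    return out

# Two hypothetical versions of the same CDR type: v2 adds an attribute.
v1 = {"caller": "A", "callee": "B", "duration_sec": 42}
v2 = {"caller": "A", "callee": "B", "duration_sec": 42, "mms_size_kb": 512}

REQUIRED = ["caller", "callee", "duration_sec", "disposition"]
n1 = normalize(v1, REQUIRED, defaults={"disposition": "UNKNOWN"})
n2 = normalize(v2, REQUIRED, defaults={"disposition": "UNKNOWN"})
```

Both versions come out with the same top-level shape, so downstream transforms do not need to change when a new CDR version appears.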

To solve these problems we need a new-generation system that is scalable and reliable, and at the same time easily adaptable to legacy operations. The first step is to decouple storage from processing. This is a powerful concept for building scalable, real-time applications: once processing is decoupled from storage, the E (Extract), T (Transform), and L (Load) stages can work asynchronously. As for storage, instead of two separate tiers, one for the short term and another for the long term, all records can be stored using a distributed message queue architecture: a single source of truth where every record is kept in a system-of-record format. The storage should also be able to hold flexible-schema records of different CDR types, along with the time each record was created. Modern storage systems such as Kafka and MapR Streams have evolved to handle such requirements in a cost-effective and reliable manner; they can scale to petabytes of data and handle varied application workloads.
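The decoupling idea can be sketched in a few lines of plain Python, with an in-process queue standing in for a distributed log such as Kafka: the extract stage appends records to the log, while an independent consumer thread transforms and loads them asynchronously. The records and the "billable" rule are invented for illustration.

```python
import json
import queue
import threading

log = queue.Queue()   # stand-in for the distributed log (single source of truth)
warehouse = []        # stand-in for a downstream consumer's long-term store

def extract(raw_lines):
    """Producer: append raw CDRs to the log as they arrive."""
    for line in raw_lines:
        log.put(line)
    log.put(None)  # sentinel marking end of stream for this sketch

def transform_and_load():
    """Consumer: read from the log at its own pace, decoupled from extract."""
    while True:
        msg = log.get()
        if msg is None:
            break
        record = json.loads(msg)                        # transform
        record["billable"] = record["duration_sec"] > 0 # illustrative rule
        warehouse.append(record)                        # load

consumer = threading.Thread(target=transform_and_load)
consumer.start()
extract(['{"caller": "A", "duration_sec": 30}',
         '{"caller": "B", "duration_sec": 0}'])
consumer.join()
```

Because the producer and consumer share only the log, either side can be scaled, restarted, or replaced without touching the other, which is exactly the property a Kafka-style architecture provides at cluster scale.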

Whiteklay’s Izac has been developed on the philosophy described above and provides a robust framework that can act as a single source of truth within a telecom environment. Built on open source technology, it helps organizations easily ingest, store, and process billions of call records. Izac’s flexible and scalable data hub framework enables CSPs (communication service providers) to architect and run a variety of use cases, such as Information Lifecycle Management (ILM), billing, product/pack recommendations, and campaign management.

For more information, please contact us at – we’d be happy to help.
