Type: Process Essays
Sample donated: Margie Alvarado
Last updated: October 2, 2019
Overview of Technologies Used for Big Data Applications

Every second, telecommunication providers handle millions of calls and texts, and they hold some of the biggest data sets in the world. Beyond calls and texts, content such as music and video adds up: people today use more than 3 GB on average, compared with 450 MB in 2012 [9]. Because the industry generates volumes of data that cannot be stored or processed on a single storage system or processor, big data technologies come into play. According to reports, the big data market in telecommunications is growing at a 28.28% CAGR to 2018 [10]. Below is a discussion of some of the main big data technologies used.

2.1 Apache Hadoop

Figure 01: Hortonworks Data Platform

Many telco services use Hortonworks to process their data [11] and use Hadoop to improve customer interactions. A Hadoop cluster can process call records in real time, which can be used to recognize patterns in call drops and call quality.
At any given time, millions of calls are in progress, and tools such as Apache Flume can ingest these millions of records into Hadoop, while Apache Storm can process the data and report on call quality and drops. Network logs and call center records provide details about bandwidth and network load in specific areas. Information derived from that data can support business decisions about expansion, network traffic and real-time bandwidth allocation. Hadoop can also store real-time streaming unstructured data, which can help reduce the cost of maintenance and infrastructure development and reduce network traffic. Cloud platforms such as Microsoft HDInsight also use Apache Hadoop and Apache Spark.
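The kind of per-record processing described above, where a stream of call records is turned into drop statistics per network area, can be sketched in plain Python. This is only an in-memory illustration, not Flume or Storm code; the record layout, the `DropRateMonitor` name and the threshold values are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical record layout; real network feeds carry many more fields.
    cell_id: str
    duration_s: int
    dropped: bool


class DropRateMonitor:
    """Keeps running per-cell call and drop counts as records stream in."""

    def __init__(self, threshold: float = 0.05, min_calls: int = 3):
        self.threshold = threshold  # assumed alert level, not from the source
        self.min_calls = min_calls  # skip cells with too few calls to judge
        self.calls = defaultdict(int)
        self.drops = defaultdict(int)

    def process(self, record: CallRecord) -> None:
        """Update the counters for one incoming record."""
        self.calls[record.cell_id] += 1
        if record.dropped:
            self.drops[record.cell_id] += 1

    def drop_rate(self, cell_id: str) -> float:
        """Fraction of calls dropped in a cell (0.0 if the cell is unseen)."""
        total = self.calls.get(cell_id, 0)
        return self.drops.get(cell_id, 0) / total if total else 0.0

    def flagged_cells(self) -> list:
        """Cells with enough traffic whose drop rate exceeds the threshold."""
        return sorted(
            cell
            for cell, total in self.calls.items()
            if total >= self.min_calls and self.drop_rate(cell) > self.threshold
        )
```

Each record is handled once and only the counters are kept, which is the property that makes this style of processing suitable for streams too large to hold in memory.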
2.2 NoSQL

Telco services use distributed data, so they face the CAP (Consistency, Availability, Partition tolerance) theorem, which states that only two of the three properties can be guaranteed at any given time [12]. Most traditional RDBMSs sit on the CA side, meaning they offer little support for partitioning. NoSQL, by contrast, is built for horizontal scalability, and NoSQL systems generally provide either consistency and partition tolerance (CP) or availability and partition tolerance (AP).
HBase, an open-source, distributed NoSQL database, can replace home location repositories, which contain data about the SIM cards issued. NoSQL makes data highly available and can scale linearly. With NoSQL, complex queries can also be run on larger data sets in less time. To process the data, techniques such as MapReduce and parallel scanning can be used. Even when a node fails, HBase can tolerate the resulting network partition and keep the data consistent.
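The MapReduce and parallel scanning techniques mentioned above can be illustrated with a small in-memory sketch. It does not use the Hadoop or HBase APIs; the SIM record layout (a dict with a `region` key), the function names and the idea that each list stands in for one node's partition are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(partition):
    """Emit a (region, 1) pair for every SIM record in one partition."""
    return [(record["region"], 1) for record in partition]


def shuffle(mapped_partitions):
    """Group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_sims_by_region(partitions):
    """Count issued SIM cards per region across all partitions."""
    # Scan partitions in parallel; each one stands in for a node's data.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, partitions))
    return reduce_phase(shuffle(mapped))
```

Because the map step only looks at one partition at a time, adding more partitions (and more workers) does not change the logic, which is the idea behind horizontal scaling.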
Figure 02: An In-Depth Look at the HBase Architecture

2.3 Hive

Hive is a distributed, SQL-based data warehouse [13] that can be used for data summarization, analysis and querying. In the telecommunication industry, call detail records (CDRs) are important data containing valuable information from which business decisions can be derived. Data that can be retrieved from a CDR includes the subscriber phone number, receiver phone number, duration, billing number, call type, SIM ID and location.
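The CDR fields listed above, and the kind of summarization a Hive query would perform over them, can be sketched as follows. This is plain Python rather than HiveQL; the field names, the example `call_type` values and the `summarize_by_location` helper are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CallDetailRecord:
    # Field names are illustrative; they mirror the CDR contents named above.
    subscriber_number: str
    receiver_number: str
    duration_s: int
    billing_number: str
    call_type: str  # e.g. "voice" or "sms" (assumed values)
    sim_id: str
    location: str


def summarize_by_location(records):
    """Return {location: (call_count, total_duration_s)}, like a GROUP BY."""
    summary = defaultdict(lambda: [0, 0])
    for record in records:
        summary[record.location][0] += 1
        summary[record.location][1] += record.duration_s
    return {location: tuple(values) for location, values in summary.items()}
```

Grouping on `location` here is only one choice; the same pattern applies to call type, billing number or a time bucket.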
Hive can be used to create virtual tables from processed data in order to make predictions about calls and network traffic. Telco companies also use Hive to run operations on time-framed data sets to get details for a given time period.

2.4 Streaming telecommunication event data analytics

Telco providers need data to run specific campaigns, improve customer interaction and detect fraud. To achieve these goals, an application needs to process large volumes of streaming unstructured data [14].
The main advantage of Telecommunications Event Data Analytics (TEDA) applications is that they are decoupled from external sources, so operators can keep working even when a database is under maintenance, or with several types of databases.

Figure 03: Architecture of a Telecommunications Event Data Analytics solution

These applications accept inputs from various sources, validate them and perform business intelligence on the processed data.
For example, TEDA can be used for real-time billing and accounting for calls.
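The validate-then-process flow and the billing example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TEDA product's API: the event layout, the `RATE_PER_MINUTE` tariff and the rule of billing per started minute are all hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Iterable, Optional

RATE_PER_MINUTE = Decimal("0.10")  # hypothetical tariff, not from the source


@dataclass(frozen=True)
class CallEvent:
    subscriber: str
    duration_s: int


def validate(raw: dict) -> Optional[CallEvent]:
    """Return a CallEvent for well-formed input, or None to reject it."""
    subscriber = raw.get("subscriber")
    duration = raw.get("duration_s")
    if not isinstance(subscriber, str) or not subscriber:
        return None
    if isinstance(duration, bool) or not isinstance(duration, int) or duration < 0:
        return None
    return CallEvent(subscriber, duration)


def charge(event: CallEvent) -> Decimal:
    """Bill per started minute (an assumed rule), using ceiling division."""
    minutes = -(-event.duration_s // 60)
    return RATE_PER_MINUTE * minutes


def bill_stream(raw_events: Iterable[dict]) -> dict:
    """Validate each incoming event and accumulate charges per subscriber."""
    balances = {}
    for raw in raw_events:
        event = validate(raw)
        if event is None:
            continue  # invalid events are skipped rather than billed
        previous = balances.get(event.subscriber, Decimal("0"))
        balances[event.subscriber] = previous + charge(event)
    return balances
```

Since `bill_stream` accepts any iterable of dicts, the source of the events (a file, a queue or a database) can change without touching the billing logic, which reflects the decoupling from external sources described above.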