SBDP (SVI Bigdata Platform) is a big data platform solution developed independently by Shenzhen SVI Technology Co., Ltd. It covers ETL data collection, data storage, analysis and mining, and business applications. Its goals are to establish an efficient, secure, multidimensional, integrated, open, and uniformly managed big data platform; to provide a one-stop data mining and analysis solution spanning terminal, cloud, and server that is full-service, full-platform, data-oriented, and supports decision operation; to establish a "data-driven" development pattern; and to improve the data operation system and build a big data operation center. Tactically, it focuses on fully strengthening the core value and competitiveness of video through applications such as operation optimization, management improvement, risk control, and external services.
1. SBDP System
Diagram of Platform System Structure:
2. SBDP Advantages
2.1 Real-time Data Acquisition and Display
The platform acquires and reports all data from the company's service platforms, terminals, and servers; transmits the data to the big data platform for stream-based integration and processing; and then aggregates it into the platform's data warehouse, so that the operation of every business platform can be displayed in a real-time, digital, and accurate manner.
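The stream-based aggregation described above can be sketched as a minimal rolling rollup. This is only an illustration of the idea; the event fields ("platform", "metric", "value") and counter names are hypothetical, not SBDP's actual schema:

```python
from collections import defaultdict

# Illustrative events as terminals/servers might report them.
events = [
    {"platform": "vod", "metric": "requests", "value": 3},
    {"platform": "live", "metric": "requests", "value": 5},
    {"platform": "vod", "metric": "requests", "value": 2},
]

def aggregate(stream):
    """Fold a stream of reported events into per-platform totals,
    the kind of rollup a real-time dashboard would display."""
    totals = defaultdict(int)
    for event in stream:
        totals[(event["platform"], event["metric"])] += event["value"]
    return dict(totals)

print(aggregate(events))  # {('vod', 'requests'): 5, ('live', 'requests'): 5}
```

In a production pipeline this fold would run continuously over an unbounded stream rather than a finite list, but the per-key accumulation is the same.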
2.2 Large-scale Scalable Storage and Calculation
Built on a cloud platform, the big data platform supports super-large-scale storage and computation. It offers high performance, a high-capacity large-cluster resource pool, automatic monitoring and management, and scalability.
2.3 Security and Reliability
Security is reinforced through end-to-end encrypted data transmission, and multiple disaster-recovery plans are available. Deliberate sabotage and attacks on the big data platform can be prevented in advance, and the service experience is improved through efficient mathematical models and algorithms.
2.4 Deep Mining
Through iterative analysis and processing of large volumes of data, a multidimensional, multidirectional, and three-dimensional user profile system is established to genuinely understand customers' preferences. On that basis, content is delivered and recommended precisely with machine learning algorithms, and deep learning is used to intelligently recognize and classify images and voices.
2.5 External Service
Provides data services to external parties.
2.6 Decision Operation
Stronger decision-making, insight, and process optimization are achieved by combining powerful big data processing with specialized analysis. With digital guidance for operation decisions, team management, and risk management and control, the platform creates more opportunities and meets greater challenges.
3. SBDP Functions
Technical Structure Diagram and Function Description:
3.1 ETL Data Acquisition
Acquired data is collected into the big data center and then fed into Kafka, a distributed message subscription system that provides time-ordered message persistence and high throughput. Kafka supports synchronous and asynchronous data processing and horizontal scaling, docks smoothly with the Flume data collection framework, and, through an open custom interface, reports various protocols and connects to various data sources. Kafka messages are classified by Topic: parties that send messages are Producers and parties that receive them are Consumers. Zookeeper is relied on to ensure system availability and to store some metadata; communication is TCP-based and very lightweight.
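The Topic/Producer/Consumer model can be illustrated with a minimal in-memory sketch. This is a toy stand-in for the concept, not the Kafka client API; a real deployment would use a Kafka client library against a running broker:

```python
from collections import defaultdict

class ToyBroker:
    """Toy broker: an append-only log per topic, mirroring Kafka's model
    of ordered, persistent messages that consumers read by offset."""
    def __init__(self):
        self.logs = defaultdict(list)

    def append(self, topic, message):
        # Producer side: messages are persisted in arrival order.
        self.logs[topic].append(message)

    def read(self, topic, offset):
        # Consumer side: read from a remembered offset and resume later.
        return self.logs[topic][offset:]

broker = ToyBroker()
broker.append("device-reports", "terminal 42 online")     # a Producer publishes
broker.append("device-reports", "terminal 42 heartbeat")

msgs = broker.read("device-reports", 0)                   # a Consumer reads
print(msgs)  # ['terminal 42 online', 'terminal 42 heartbeat']
```

Because consumers track their own offsets, multiple independent consumers can read the same topic at their own pace, which is what allows Kafka to decouple data producers from downstream processing.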
3.2 Data Storage and Calculation
The big data center uses HDFS (Hadoop Distributed File System) as its storage scheme; its multi-copy replication mechanism protects against data loss. It runs on low-cost machines and is well suited to big data storage. Thanks to its distributed architecture, the center offers strong scalability and huge data throughput (handling GB, TB, and even PB scales). Mainstream storage formats such as JSON, CSV, and Parquet are supported. For computation, the Spark distributed computing framework is adopted: its flexible MapReduce programming model and DAG (directed acyclic graph) task-scheduling mechanism are well suited to iterative computation, and intermediate results can be kept in memory instead of being written to disk, yielding ultra-low-latency computation. Even for offline computing, data is loaded from disk only once, which is a clear advantage.
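The MapReduce model that Spark builds on can be sketched in a few lines of plain Python. This is a toy illustration of the map, shuffle, and reduce phases on a word-count job, not Spark's actual API:

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.split():
            yield word, 1

def reduce_phase(pairs):
    """Shuffle + reduce: group pairs by key and sum the values."""
    grouped = defaultdict(int)
    for key, value in pairs:
        grouped[key] += value
    return dict(grouped)

records = ["spark spark hdfs", "hdfs kafka"]
counts = reduce_phase(map_phase(records))
print(counts)  # {'spark': 2, 'hdfs': 2, 'kafka': 1}
```

In Spark itself this corresponds to a `flatMap`/`reduceByKey` pipeline over a distributed dataset, with the grouped intermediate results held in memory across stages rather than spilled to disk between them.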
3.3 Data Mining
Data mining is realized through feature extraction, mathematical models, statistics, online analytical processing, machine learning, pattern recognition, and deep learning. On this basis, an all-around, three-dimensional user profile system is established, enabling characteristic functions such as statistical reporting, accurate content recommendation, service forecasting, decision support, risk management and control, and demand mining.
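As one concrete illustration of profile-based content recommendation, items can be ranked against a user's feature vector by cosine similarity. The feature dimensions and vectors below are invented for the example; the document does not describe SBDP's actual models:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical user profile: interest weights over (sports, news, drama).
user = [0.9, 0.1, 0.4]

# Hypothetical content items described in the same feature space.
items = {
    "match-highlights": [1.0, 0.0, 0.1],
    "evening-news":     [0.0, 1.0, 0.2],
    "tv-series":        [0.2, 0.1, 1.0],
}

# Recommend items ranked by similarity to the user's profile.
ranked = sorted(items, key=lambda k: cosine(user, items[k]), reverse=True)
print(ranked[0])  # 'match-highlights'
```

Real recommendation systems learn these feature vectors from behavior data and combine similarity scores with many other signals, but the ranking step reduces to the same idea.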