CN104077359A - Data cleaning and integrating intelligent system - Google Patents

Publication number
CN104077359A
Authority
CN
China
Prior art keywords
data
terminal
crm
cloud storage
platform
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410246840.6A
Other languages
Chinese (zh)
Inventor
胥斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NANJING ZHIKU BUSINESS CONSULTATION Co Ltd
Original Assignee
NANJING ZHIKU BUSINESS CONSULTATION Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-06-05
Filing date
2014-06-05
Publication date
2014-10-01
Application filed by NANJING ZHIKU BUSINESS CONSULTATION Co Ltd
Priority to CN201410246840.6A
Publication of CN104077359A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21: Design, administration or maintenance of databases
    • G06F16/215: Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors

Abstract

The invention discloses an intelligent system for data cleaning and integration, comprising a database unit, a cloud storage platform, an artificial intelligence data platform, and a terminal. The cloud storage platform collects data sources together with related information that stands in a hierarchical (subordinate) relationship to them, establishes the logical relationships, and compares the result against the database unit: it performs correction matching on the top-level information in the hierarchy, then performs correction matching downward level by level, and encrypts and stores the matched data. The artificial intelligence data platform performs a series of actions, including data audit, data migration, data capture, data cleaning, data extraction, and reporting, so that the data is fully organized and its consistency, integrity, and correctness are ensured.

Description

Data cleansing and integration intelligent system
Technical field
The present invention relates to an intelligent system for data cleansing and integration.
Background art
Big data is a sunrise industry, but its use is still at an early stage. One reason is that enterprises do not yet understand big data processing well and the data accumulated within each industry is insufficient, so valuable information cannot be extracted from the relatively limited data available. Another reason is the lack of mature experience in big data analysis and processing: existing data analysis techniques largely stop at displaying data and cannot provide much value-added information or intelligent advice. Enterprises must still make decisions on their own, and their ability to extract value from data remains weak.
Enterprise demand for commercial big data analysis solutions is in its infancy. At present, enterprises are at a loss when faced with the mass of data that accumulates day by day; often they know neither how to analyze it nor what the goal of the analysis should be. Against the background of national industrial upgrading, enterprises of all kinds are attempting to innovate and to offer high value-added products and services. Using existing data to support timely, effective, automatic, and scientific decision-making is increasingly a mark of an enterprise's core competitiveness. Enterprises will depend more and more on data analysis, and this is where the large market for data analysis lies.
With the emergence of the cloud concept, enterprises are now able to build their own cloud platforms, and the collection and storage of big data have become feasible. How to apply a cloud platform to an enterprise's own development has become a pressing research problem.
Summary of the invention
Object of the invention: to overcome the deficiencies of the prior art and meet the growing need to process data accumulated over long periods, the invention provides an intelligent system for data cleansing and integration that is flexible to manage, efficient, and accurate in the information it yields.
Technical solution: the data cleansing and integration intelligent system of the present invention achieves this object as follows.
A data cleansing and integration intelligent system, comprising:
Database Unit: a database unit built according to the needs of the industry, with indexes established on it;
Cloud storage platform: collects data sources together with related information that stands in a hierarchical (subordinate) relationship to each data source, and constructs the logical relationships among them; compares this against the Database Unit, first performing correction matching on the top-level information in the hierarchy and then performing correction matching downward level by level; the matched data is encrypted by an algorithm and stored;
Artificial intelligence data platform: audits the data stored in the cloud storage platform and, with reference to the terminal's calling rules, issues audit recommendations; normalizes the existing data into a format suitable for the CRM application; sets up the terminal's CRM database according to the calling rules; and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for the CRM application;
Terminal: provides the most suitable data capture method for the terminal and ensures that data is written completely into the CRM database; cleans the data captured in each unit of time according to the specification, ensuring that it conforms to the CRM usage standard, and consolidates it into daily reports; performs ad hoc data extraction according to terminal requirements; and provides reports on demand according to terminal requirements.
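As an illustration of the top-down correction matching attributed to the cloud storage platform above, the following minimal sketch corrects a record level by level against reference data. It is not the patented method: the three-level reference hierarchy, the field names `province`, `city`, and `district`, the function name `correct_top_down`, and the choice of fuzzy string matching via Python's `difflib.get_close_matches` are all assumptions made for the example.

```python
from difflib import get_close_matches

# Hypothetical reference data standing in for the Database Unit:
# a nested structure, top level first (province -> city -> districts).
REFERENCE = {
    "Jiangsu": {
        "Nanjing": ["Xuanwu", "Gulou"],
        "Suzhou": ["Gusu", "Wuzhong"],
    },
    "Zhejiang": {
        "Hangzhou": ["Xihu", "Binjiang"],
    },
}

LEVELS = ("province", "city", "district")  # assumed field names


def correct_top_down(record, reference=REFERENCE, cutoff=0.6):
    """Correct each level against the reference, starting at the top.

    Returns the corrected record, or None if some level cannot be matched.
    """
    corrected = dict(record)
    node = reference
    for level in LEVELS:
        # Candidates are restricted to the children of the level matched above.
        candidates = list(node)
        match = get_close_matches(record.get(level, ""), candidates, n=1, cutoff=cutoff)
        if not match:
            return None
        corrected[level] = match[0]
        # Descend one level; the last level is a plain list with no children.
        node = node[match[0]] if isinstance(node, dict) else []
    return corrected
```

Restricting the candidates at each level to the children of the value matched above is what makes the correction top-down: a misspelled city is compared only against the cities of the province that has already been confirmed.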
The cloud storage platform accepts data exported from mainstream data sources such as SQL Server, Oracle, and Teradata in the form of ASCII text files, XML files, or Excel spreadsheets; these files are transferred to the cloud storage platform via Sterling File Gateway or FTP/SFTP/HTTPS.
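To make the ingestion of exported files concrete, the sketch below reads two of the file types named above into a uniform list of records using only the Python standard library. The pipe delimiter, the `<row>` element layout, and the function names `read_delimited` and `read_xml` are assumptions; the patent does not specify file layouts. Excel parsing and the file transfer itself are left out because they would need third-party libraries.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_delimited(text, delimiter="|"):
    """Parse a delimited ASCII export whose first row holds the column names."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def read_xml(text, row_tag="row"):
    """Parse an XML export in which each <row> element holds one record
    and each child element is one column (an assumed layout)."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in row}
        for row in root.iter(row_tag)
    ]
```

Both readers return a list of dictionaries keyed by column name, so later stages can treat every source the same way regardless of the export format.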
Beneficial effects: the data cleansing and integration platform handles data acquisition and distribution for the shared data center. It provides data exchange services for the exchanged information, such as cleaning, transformation, and loading into the warehouse; it clears out dirty data and completes the organization of the data, ensuring data consistency, integrity, and correctness.
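A minimal sketch of what clearing dirty data could look like is given below. The patent does not define its cleaning rules, so the three rules used here (trim whitespace, reject rows whose key field is not a plausible e-mail address, and drop duplicates on that key) and the names `clean_records` and `EMAIL_RE` are assumptions chosen only to illustrate the idea.

```python
import re

# Deliberately loose plausibility check, not a full address validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def clean_records(records, key="email"):
    """Trim whitespace, drop invalid rows, and de-duplicate on `key`.

    Returns (clean, rejected); the first occurrence of each key wins.
    """
    seen = set()
    clean, rejected = [], []
    for rec in records:
        # Normalise: strip surrounding whitespace from every string value.
        rec = {k: v.strip() if isinstance(v, str) else v for k, v in rec.items()}
        value = (rec.get(key) or "").lower()
        if not EMAIL_RE.match(value):
            rejected.append(rec)  # invalid entry
            continue
        if value in seen:
            rejected.append(rec)  # duplicate of an earlier row
            continue
        seen.add(value)
        rec[key] = value
        clean.append(rec)
    return clean, rejected
```

Returning the rejected rows alongside the clean ones keeps the step auditable: nothing is silently discarded, which matters when correctness of the loaded data is the stated goal.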
Each business system exchanges and shares data through the cleansing and integration system and the shared data center platform. Each business system runs independently without affecting the others, so a fault in one business system does not affect the other systems.
Embodiment
To aid understanding of the present invention, it is further described below with reference to an embodiment. The embodiment serves only to explain the invention and does not limit its scope of protection.
A data cleansing and integration intelligent system, comprising:
Database Unit: a database unit built according to the needs of the industry, with indexes established on it;
Cloud storage platform: collects data sources together with related information that stands in a hierarchical (subordinate) relationship to each data source, and constructs the logical relationships among them; compares this against the Database Unit, first performing correction matching on the top-level information in the hierarchy and then performing correction matching downward level by level; the matched data is encrypted by an algorithm and stored;
Artificial intelligence data platform: audits the data stored in the cloud storage platform and, with reference to the terminal's calling rules, issues audit recommendations; normalizes the existing data into a format suitable for the CRM application; sets up the terminal's CRM database according to the terminal's calling rules; and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for the CRM application;
Terminal: provides the most suitable data capture method for the terminal and ensures that data is written completely into the CRM database; cleans the data captured in each unit of time according to the specification, ensuring that it conforms to the CRM usage standard, and consolidates it into daily reports; performs ad hoc data extraction according to terminal requirements; and provides reports on demand according to terminal requirements.
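The audit and normalization performed by the artificial intelligence data platform can be pictured with the following sketch. The patent does not say what the terminal's calling rules contain, so the `RULES` dictionary (a list of required CRM fields plus a column-name mapping), all field names, and the functions `normalize` and `audit` are hypothetical.

```python
# Hypothetical "terminal calling rules": which CRM fields the terminal needs
# and which source columns feed them. None of these names come from the patent.
RULES = {
    "required": ["customer_name", "phone"],
    "mapping": {"cust_nm": "customer_name", "tel": "phone", "addr": "address"},
}


def normalize(record, rules=RULES):
    """Rename source columns to the CRM field names given by the rules."""
    return {rules["mapping"].get(k, k): v for k, v in record.items()}


def audit(records, rules=RULES):
    """Return (row_index, recommendation) pairs for rows that fail the rules."""
    findings = []
    for i, rec in enumerate(records):
        crm = normalize(rec, rules)
        for field in rules["required"]:
            # A required field is missing if it is absent, None, or blank.
            if not str(crm.get(field) or "").strip():
                findings.append((i, f"fill in missing field '{field}'"))
    return findings
```

Keeping the rules in data rather than in code means a different terminal can be served by swapping the rule set, which is one plausible reading of auditing "with reference to the terminal's calling rules".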
Referring to Fig. 1, the system of the present invention is built as follows:
Step 1: database construction. Collect information on the specific industry as required, build the Database Unit, and establish indexes;
Step 2: data analysis. Collect data sources together with related information that stands in a hierarchical (subordinate) relationship to each data source, and construct the logical relationships; compare against the Database Unit, performing correction matching first on the top-level information in the hierarchy and then downward level by level; encrypt the matched data by an algorithm and store it;
Step 3: data audit. Audit the data stored in the cloud storage platform, issue audit recommendations with reference to the terminal's calling rules, and normalize the existing data into a format suitable for the CRM application;
Step 4: data migration. Set up the terminal's CRM database according to the terminal's calling rules, and migrate the audited and normalized data into it, providing the data foundation for the CRM application;
Step 5: data capture. Provide the most suitable data capture method for the terminal and ensure that data is written completely into the CRM database;
Step 6: data cleansing. Clean the data captured in each unit of time according to the specification, ensuring that it conforms to the CRM usage standard, and consolidate it into daily reports;
Step 7: data extraction and reporting. Perform ad hoc data extraction according to terminal requirements, and provide reports on demand according to terminal requirements.
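The migration and reporting steps above can be sketched with Python's built-in `sqlite3` module standing in for the terminal's CRM database. The table name `crm_customer`, its columns, the index name, and the functions `migrate` and `daily_report` are assumptions made for illustration; the patent names no schema and no database product for the CRM side.

```python
import sqlite3


def migrate(conn, records):
    """Create the CRM table with an index and load audited records."""
    # "with conn" commits on success and rolls back if an exception is raised,
    # so a failed load does not leave partially inserted rows behind.
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS crm_customer ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, "
            "city TEXT, created TEXT NOT NULL)"
        )
        conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_crm_created ON crm_customer (created)"
        )
        conn.executemany(
            "INSERT INTO crm_customer (name, city, created) "
            "VALUES (:name, :city, :created)",
            records,
        )


def daily_report(conn):
    """Count the customers loaded on each day, oldest day first."""
    return conn.execute(
        "SELECT created, COUNT(*) FROM crm_customer "
        "GROUP BY created ORDER BY created"
    ).fetchall()
```

Loading inside a single transaction is one way to approach the requirement that data be written completely into the CRM database, and the index on `created` supports the per-day grouping used by the report.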
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (2)

  1. A data cleansing and integration intelligent system, characterized in that the system comprises:
    Database Unit: a database unit built according to the needs of the industry, with indexes established on it;
    Cloud storage platform: collects data sources together with related information that stands in a hierarchical (subordinate) relationship to each data source, and constructs the logical relationships among them; compares this against the Database Unit, first performing correction matching on the top-level information in the hierarchy and then performing correction matching downward level by level; the matched data is encrypted by an algorithm and stored;
    Artificial intelligence data platform: audits the data stored in the cloud storage platform and, with reference to the terminal's calling rules, issues audit recommendations; normalizes the existing data into a format suitable for the CRM application; sets up the terminal's CRM database according to the calling rules; and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for the CRM application;
    Terminal: provides the most suitable data capture method for the terminal and ensures that data is written completely into the CRM database; cleans the data captured in each unit of time according to the specification, ensuring that it conforms to the CRM usage standard, and consolidates it into daily reports; performs ad hoc data extraction according to terminal requirements; and provides reports on demand according to terminal requirements.
  2. The data cleansing and integration intelligent system according to claim 1, characterized in that the cloud storage platform accepts data exported from mainstream data sources such as SQL Server, Oracle, and Teradata in the form of ASCII text files, XML files, or Excel spreadsheets, and that these files are transferred to the cloud storage platform via Sterling File Gateway or FTP/SFTP/HTTPS.
CN201410246840.6A 2014-06-05 2014-06-05 Data cleaning and integrating intelligent system Pending CN104077359A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410246840.6A CN104077359A (en) 2014-06-05 2014-06-05 Data cleaning and integrating intelligent system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410246840.6A CN104077359A (en) 2014-06-05 2014-06-05 Data cleaning and integrating intelligent system

Publications (1)

Publication Number Publication Date
CN104077359A (en) 2014-10-01

Family

ID=51598613

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410246840.6A Pending CN104077359A (en) 2014-06-05 2014-06-05 Data cleaning and integrating intelligent system

Country Status (1)

Country Link
CN (1) CN104077359A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080162518A1 (en) * 2007-01-03 2008-07-03 International Business Machines Corporation Data aggregation and grooming in multiple geo-locations
CN101969475A (en) * 2010-11-15 2011-02-09 张军 Business data controllable distribution and fusion application system based on cloud computing
CN102495885A (en) * 2011-12-08 2012-06-13 中国信息安全测评中心 Method for integrating information safety data based on base-networking engine
CN103455636A (en) * 2013-09-27 2013-12-18 浪潮齐鲁软件产业有限公司 Automatic capturing and intelligent analyzing method based on Internet tax data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
代斌: "基于数据接口标准的数据采集分析技术", 《审计研究》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317624A (en) * 2014-11-04 2015-01-28 南京联创科技集团股份有限公司 Plug-in processing based data assembling method
CN104317624B (en) * 2014-11-04 2017-06-06 南京联创科技集团股份有限公司 Data assembly method based on plug-in unit treatment
CN106933990A (en) * 2017-02-21 2017-07-07 南京朴厚生态科技有限公司 A kind of sensing data cleaning method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20141001