CN104077359A - Data cleaning and integrating intelligent system - Google Patents
Info
- Publication number: CN104077359A
- Application number: CN201410246840.6A
- Authority: CN (China)
- Prior art keywords: data, terminal, CRM, cloud storage, platform
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
Abstract
The invention discloses a data cleaning and integration intelligent system comprising a database unit, a cloud storage platform, an artificial intelligence data platform and a terminal. The cloud storage platform collects data sources together with the related information that has a hierarchical membership relationship with them, establishes the logical relationships, compares them against the database unit, performs correction matching starting from the top-level information in the hierarchy and proceeding downward level by level, and stores the matched data after algorithmic encryption. The artificial intelligence data platform performs a series of actions, including data audit, data migration, data capture, data cleaning, data extraction and reporting, which completes the organization of the data and ensures data consistency, integrity and correctness.
Description
Technical field
The present invention relates to a data cleaning and integration intelligent system.
Background technology
Big data is an emerging industry, but its application is still at an early stage. On the one hand, enterprises do not yet understand big data processing deeply enough, and the data that an industry has accumulated is insufficient, so valuable information cannot be extracted from relatively limited data. On the other hand, there is little mature experience in big data analysis and processing: existing data analysis techniques largely stop at displaying data and provide little value-added information or intelligent advice, so enterprises still have to rely on themselves to make decisions, and their ability to extract value from data remains weak.
Enterprise demand for commercial big data analysis solutions is still in its infancy. At present, enterprises are at a loss when faced with the mass of data that accumulates day by day, and often know neither how to analyze it nor what the goal of the analysis should be. Against the background of national industrial upgrading, enterprises of all kinds are attempting to innovate and to provide high value-added products and services. How to use existing data to support timely, effective, automatic and scientific decision-making is increasingly an expression of an enterprise's core competitiveness. Enterprises will depend more and more on data analysis in the future, and this is where the large market for data analysis lies.
With the emergence of the cloud concept, enterprises are now able to build their own cloud platforms, and collecting and storing big data has become possible. How to apply a cloud platform to an enterprise's own development has become an urgent problem in current research.
Summary of the invention
Object of the invention: to overcome the deficiencies of the prior art and to meet the growing demands of processing data accumulated over a long period, the invention provides a data cleaning and integration intelligent system that is flexible to manage, efficient, and accurate in the information it delivers.
Technical solution: the data cleaning and integration intelligent system of the present invention achieves this object as follows.
A data cleaning and integration intelligent system, comprising:
Database unit: a database unit is built according to the needs of the industry, and an index is established;
Cloud storage platform: collects data sources together with the related information that has a hierarchical membership relationship with each data source, establishes the logical relationships, compares them against the database unit, performs correction matching starting from the top-level information in the hierarchy and then proceeds downward level by level, and stores the matched data after algorithmic encryption;
Artificial intelligence data platform: audits the data stored on the cloud storage platform and proposes audit recommendations in light of the terminal's calling rules; normalizes the existing data into a form suitable for CRM applications; builds the terminal's CRM database according to the calling rules; and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for CRM applications;
Terminal: the most suitable data capture method is provided for the terminal so that data is written to the CRM database intact; data captured in each unit time interval is cleaned according to the specification so that it conforms to the CRM usage standard and is consolidated into daily reports; ad hoc data extraction is performed according to terminal requirements; and reports are provided on demand according to terminal requirements.
The cloud storage platform exports mainstream data sources, namely ASCII text files, XML files and Excel spreadsheet documents, to SQL Server, Oracle and Teradata, and the data is transferred to the cloud storage platform by way of Sterling File Gateway or FTP/SFTP/HTTPS.
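The ingestion path described above can be illustrated with a minimal sketch. It assumes a delimited text file with a header row and an XML file whose rows are `<record>` elements; the field names, the `staging` table and the use of an in-memory SQLite database as a stand-in for SQL Server, Oracle or Teradata are all illustrative assumptions, not details given in the patent.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET


def records_from_delimited(text, delimiter=","):
    """Parse delimited ASCII text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


def records_from_xml(text, record_tag="record"):
    """Parse XML in which each <record> element holds one child element per field."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "").strip() for child in rec}
        for rec in root.iter(record_tag)
    ]


# Hypothetical sample inputs standing in for files received over FTP/SFTP/HTTPS.
csv_text = "customer_id,name\n1,Acme\n2,Globex\n"
xml_text = (
    "<records><record><customer_id>3</customer_id>"
    "<name>Initech</name></record></records>"
)

# Bring both sources into one uniform record shape.
unified = records_from_delimited(csv_text) + records_from_xml(xml_text)

# SQLite is only a stand-in here for the relational targets named in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (customer_id TEXT, name TEXT)")
conn.executemany("INSERT INTO staging VALUES (:customer_id, :name)", unified)
row_count = conn.execute("SELECT COUNT(*) FROM staging").fetchone()[0]
```

Normalizing every source into the same dictionary shape before loading keeps the later matching and cleaning stages independent of the original file format.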
Beneficial effects: the data cleaning and integration platform performs data acquisition and distribution for the shared data center; provides data exchange services for the exchanged information, such as cleaning, transformation and loading into the warehouse; cleans up dirty data; completes the organization of the data; and ensures data consistency, integrity and correctness.
Each business system exchanges and shares data through the cleaning and integration system and the shared data center platform. The business systems run independently without affecting one another, so a fault in one business system does not affect the others.
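The isolation property can be sketched as follows. This is only an illustration under assumptions: the patent does not say how isolation is achieved, and the system names and the `exchange_all` helper are hypothetical. Each business system's export runs on its own, and a failure is recorded for that system alone.

```python
def exchange_all(systems):
    """Run each system's export independently and collect per-system outcomes."""
    results = {}
    for name, export in systems.items():
        try:
            results[name] = ("ok", export())
        except Exception as exc:  # confine the failure to this one system
            results[name] = ("failed", str(exc))
    return results


def broken_export():
    """Hypothetical export that simulates a faulty business system."""
    raise RuntimeError("billing database unreachable")


outcome = exchange_all({
    "crm": lambda: ["record-1", "record-2"],
    "billing": broken_export,
    "inventory": lambda: ["record-3"],
})
```

In this sketch the `billing` failure is reported, while the `crm` and `inventory` exports still complete.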
Embodiment
To deepen understanding of the present invention, it is further described below with reference to an embodiment. The embodiment serves only to explain the invention and does not limit its scope of protection.
A data cleaning and integration intelligent system, comprising:
Database unit: a database unit is built according to the needs of the industry, and an index is established;
Cloud storage platform: collects data sources together with the related information that has a hierarchical membership relationship with each data source, establishes the logical relationships, compares them against the database unit, performs correction matching starting from the top-level information in the hierarchy and then proceeds downward level by level, and stores the matched data after algorithmic encryption;
Artificial intelligence data platform: audits the data stored on the cloud storage platform and proposes audit recommendations in light of the terminal's calling rules; normalizes the existing data into a form suitable for CRM applications; builds the terminal's CRM database according to the terminal's calling rules; and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for CRM applications;
Terminal: the most suitable data capture method is provided for the terminal so that data is written to the CRM database intact; data captured in each unit time interval is cleaned according to the specification so that it conforms to the CRM usage standard and is consolidated into daily reports; ad hoc data extraction is performed according to terminal requirements; and reports are provided on demand according to terminal requirements.
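The top-down correction matching performed by the cloud storage platform can be sketched as follows. The patent does not specify a matching algorithm, so this example assumes a two-level hierarchy (province, then city) and uses fuzzy string matching from Python's `difflib` as a stand-in; the `REFERENCE` data and the function names are hypothetical.

```python
import difflib

# Hypothetical reference hierarchy held by the database unit:
# top-level entry -> entries allowed on the next level down.
REFERENCE = {
    "Jiangsu": ["Nanjing", "Suzhou"],
    "Zhejiang": ["Hangzhou", "Ningbo"],
}


def closest(value, candidates, cutoff=0.6):
    """Return the reference entry closest to value, or None if none is similar enough."""
    matches = difflib.get_close_matches(value, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def correct_top_down(record):
    """Correct the top-level field first, then match the child only within that parent."""
    province = closest(record["province"], list(REFERENCE))
    if province is None:
        return None  # top level unmatched, so lower levels cannot be validated
    city = closest(record["city"], REFERENCE[province])
    if city is None:
        return None
    return {"province": province, "city": city}


corrected = correct_top_down({"province": "Jiangsu ", "city": "Nanjng"})
unmatched = correct_top_down({"province": "Qwerty", "city": "Nanjing"})
```

Matching the top level first narrows the candidate set for each lower level, which is one plausible reading of "downward level by level" in the text.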
Referring to Fig. 1, the system of the present invention is built as follows:
Step 1: database construction. Industry-specific information is collected as needed, the database unit is built, and an index is established;
Step 2: data analysis. Data sources are collected together with the related information that has a hierarchical membership relationship with each data source, the logical relationships are established and compared against the database unit, correction matching is performed starting from the top-level information in the hierarchy and proceeding downward level by level, and the matched data is stored after algorithmic encryption;
Step 3: data audit. The data stored on the cloud storage platform is audited, audit recommendations are proposed in light of the terminal's calling rules, and the existing data is normalized into a form suitable for CRM applications;
Step 4: data migration. The terminal's CRM database is built according to the terminal's calling rules, and the audited and normalized data is migrated into it, providing the data foundation for CRM applications;
Step 5: data capture. The most suitable data capture method is provided for the terminal so that data is written to the CRM database intact;
Step 6: data cleaning. Data captured in each unit time interval is cleaned according to the specification so that it conforms to the CRM usage standard and is consolidated into daily reports;
Step 7: data extraction and reporting. Ad hoc data extraction is performed according to terminal requirements, and reports are provided on demand according to terminal requirements.
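The cleaning and daily-report steps can be illustrated with a minimal sketch. The patent does not define the cleaning specification, so the required fields, the normalization rules and the duplicate key below are assumptions chosen for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical CRM specification: every record must carry these fields.
REQUIRED_FIELDS = ("customer_id", "email", "captured_at")


def clean(batch):
    """Drop records with blank required fields, normalize values, and de-duplicate."""
    seen, cleaned = set(), []
    for rec in batch:
        if any(not str(rec.get(field, "")).strip() for field in REQUIRED_FIELDS):
            continue  # invalid entry: a required field is missing or blank
        norm = {
            "customer_id": str(rec["customer_id"]).strip(),
            "email": rec["email"].strip().lower(),
            "captured_at": datetime.fromisoformat(rec["captured_at"].strip()),
        }
        key = (norm["customer_id"], norm["email"])
        if key in seen:
            continue  # duplicate of a record already kept
        seen.add(key)
        cleaned.append(norm)
    return cleaned


def daily_report(cleaned):
    """Count cleaned records per calendar day."""
    return dict(Counter(rec["captured_at"].date().isoformat() for rec in cleaned))


# Hypothetical batch captured during one unit time interval.
batch = [
    {"customer_id": " 1", "email": "A@Example.com ", "captured_at": "2014-06-05T09:00:00"},
    {"customer_id": "1", "email": "a@example.com", "captured_at": "2014-06-05T10:00:00"},
    {"customer_id": "2", "email": "", "captured_at": "2014-06-05T11:00:00"},
    {"customer_id": "3", "email": "c@example.com", "captured_at": "2014-06-06T08:30:00"},
]
report = daily_report(clean(batch))
```

Here the second record is dropped as a duplicate of the first after normalization, and the third is dropped because its email is blank.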
The foregoing is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (2)
- 1. A data cleaning and integration intelligent system, characterized in that the system comprises: a database unit, built according to the needs of the industry and provided with an index; a cloud storage platform, which collects data sources together with the related information that has a hierarchical membership relationship with each data source, establishes the logical relationships, compares them against the database unit, performs correction matching starting from the top-level information in the hierarchy and then proceeds downward level by level, and stores the matched data after algorithmic encryption; an artificial intelligence data platform, which audits the data stored on the cloud storage platform, proposes audit recommendations in light of the terminal's calling rules, normalizes the existing data into a form suitable for CRM applications, builds the terminal's CRM database according to the calling rules, and migrates the audited and normalized data into the terminal's CRM database, providing the data foundation for CRM applications; and a terminal, for which the most suitable data capture method is provided so that data is written to the CRM database intact, data captured in each unit time interval is cleaned according to the specification so that it conforms to the CRM usage standard and is consolidated into daily reports, ad hoc data extraction is performed according to terminal requirements, and reports are provided on demand according to terminal requirements.
- 2. The data cleaning and integration intelligent system according to claim 1, characterized in that the cloud storage platform exports mainstream data sources, namely ASCII text files, XML files and Excel spreadsheet documents, to SQL Server, Oracle and Teradata, and the data is transferred to the cloud storage platform by way of Sterling File Gateway or FTP/SFTP/HTTPS.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410246840.6A CN104077359A (en) | 2014-06-05 | 2014-06-05 | Data cleaning and integrating intelligent system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410246840.6A CN104077359A (en) | 2014-06-05 | 2014-06-05 | Data cleaning and integrating intelligent system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104077359A | 2014-10-01 |
Family
ID=51598613
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410246840.6A, Pending, published as CN104077359A (en) | 2014-06-05 | 2014-06-05 | Data cleaning and integrating intelligent system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104077359A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080162518A1 (en) * | 2007-01-03 | 2008-07-03 | International Business Machines Corporation | Data aggregation and grooming in multiple geo-locations |
CN101969475A (en) * | 2010-11-15 | 2011-02-09 | 张军 | Business data controllable distribution and fusion application system based on cloud computing |
CN102495885A (en) * | 2011-12-08 | 2012-06-13 | 中国信息安全测评中心 | Method for integrating information safety data based on base-networking engine |
CN103455636A (en) * | 2013-09-27 | 2013-12-18 | 浪潮齐鲁软件产业有限公司 | Automatic capturing and intelligent analyzing method based on Internet tax data |
Non-Patent Citations (1)
Title |
---|
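代斌 (Dai Bin): "基于数据接口标准的数据采集分析技术" (Data acquisition and analysis technology based on data interface standards), 《审计研究》 (Auditing Research) * |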
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104317624A (en) * | 2014-11-04 | 2015-01-28 | 南京联创科技集团股份有限公司 | Plug-in processing based data assembling method |
CN104317624B (en) * | 2014-11-04 | 2017-06-06 | 南京联创科技集团股份有限公司 | Data assembly method based on plug-in unit treatment |
CN106933990A (en) * | 2017-02-21 | 2017-07-07 | 南京朴厚生态科技有限公司 | A kind of sensing data cleaning method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107145586B (en) | Label output method and device based on electric power marketing data | |
CN104820670B (en) | A kind of acquisition of power information big data and storage method | |
CN104462314B (en) | Power grid data processing method and device | |
CN104318481A (en) | Power-grid-operation-oriented holographic time scale measurement data extraction conversion method | |
CN102902752A (en) | Method and system for monitoring log | |
CN106709035A (en) | Preprocessing system for electric power multi-dimensional panoramic data | |
CN110750650A (en) | Construction method and device of enterprise knowledge graph | |
Nobre et al. | Assessing the Role of Big Data and the Internet of Things on the Transition to Circular Economy: Part II: An extension of the ReSOLVE framework proposal through a literature review | |
CN112182077B (en) | Intelligent operation and maintenance system based on data middling platform technology | |
CN102819589B (en) | ETL (Extract Transform Load)-based data optimization method and equipment | |
CN102750367A (en) | Big data checking system and method thereof on cloud platform | |
CN111090643B (en) | Mass electricity consumption data mining method based on data analysis system | |
CN106296458A (en) | Water utilities data processing method, device and water utilities data collecting system | |
SG10201702888XA (en) | Platform for the integration of operational bim, operational intelligence, and user journeys for the simplified and unified management of smart cities | |
CN102495916A (en) | Multi-application-system panoramic modeling method based on object matching | |
CN108287889B (en) | A kind of multi-source heterogeneous date storage method and system based on elastic table model | |
CN102999528A (en) | Method and device for ETL (Extract Transform and Load) task off-lining and data cleaning in data warehouse | |
CN106022640B (en) | Electric quantity index checking system and method | |
CN103984723A (en) | Method used for updating data mining for frequent item by incremental data | |
CN104077359A (en) | Data cleaning and integrating intelligent system | |
CN114218291A (en) | Portrait generation method, apparatus, device and storage medium based on target object | |
CN104361086A (en) | Data integration method for measurable asset entire life-cycle management system | |
CN112883001A (en) | Data processing method, device and medium based on marketing and distribution through data visualization platform | |
CN204790999U (en) | Big data acquisition of industry and processing system | |
CN116842092A (en) | Method and system for database construction and collection management |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20141001 |