CN105183713A - English composition automatic correcting method and system - Google Patents

Publication number
CN105183713A
Authority
CN
China
Prior art keywords
average
multiple text
text feature
points
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510536642.8A
Other languages
Chinese (zh)
Inventor
唐聪
宋文略
杨晓昊
许轶
肖迪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Focusedu International Education Consultation Co Ltd
Original Assignee
Beijing Focusedu International Education Consultation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Focusedu International Education Consultation Co Ltd filed Critical Beijing Focusedu International Education Consultation Co Ltd
Priority to CN201510536642.8A priority Critical patent/CN105183713A/en
Publication of CN105183713A publication Critical patent/CN105183713A/en
Pending legal-status Critical Current

Abstract

The present invention provides an automatic correction method and system for English compositions. The method comprises: extracting a plurality of text features of a composition to be corrected; scoring the plurality of text features by a modified preset scoring rule; and obtaining a first average score of the scores of the plurality of text features, and using the first average score as the score of the composition to be corrected. By using a two-stage method that fuses statistics and rules, the method not only alleviates the statistical method's need for a large volume of corpus data, but also helps to address the problems of comprehensiveness and accuracy that arise when formulating rules. In addition, the weight parameter of each text feature is determined by correcting for the score deviation, which makes the weight parameters more accurate. The volume of data to be collected and annotated is reduced, saving time and manpower. Furthermore, combining the statistics-based and rule-based technical schemes makes the results more robust.

Description

Automatic correction method and system for English compositions
Technical field
The present invention relates to the technical field of correcting teaching assignments, and in particular to an automatic correction method and system for English compositions.
Background art
Automatic correction of English compositions not only reduces teachers' workload but also lets students score and revise their compositions on their own, improving their writing ability and skill efficiently and accurately.
One existing method of automatically correcting English compositions performs statistical analysis on a large corpus: by computing the distance between a composition and a standard corpus, it generates the composition's score and a content analysis in real time. It produces a comprehensive score from four major feature groups (vocabulary, sentences, article structure, and content relevance) and comments on the composition sentence by sentence. However, this method relies too heavily on the corpus. When the template corpus is insufficient, automatic correction is difficult to achieve and its accuracy is very low. Because a large template corpus is hard to obtain, batch automatic correction of compositions is difficult to achieve with this method.
Summary of the invention
In view of the defects in the prior art, the present invention provides an automatic correction method and system for English compositions that enables batch automatic correction of compositions.
In a first aspect, the present invention provides an automatic correction method for English compositions, comprising:
Extracting a plurality of text features of a composition to be corrected;
Scoring the plurality of text features by a modified preset scoring rule;
Obtaining a first average score of the scores of the plurality of text features, and using the first average score as the score of the composition to be corrected.
Optionally, extracting the plurality of text features of the composition to be corrected comprises:
Extracting the word feature, the sentence feature, the paragraph structure feature, and the topic sentence semantic feature of the composition to be corrected.
Optionally, before the plurality of text features are scored by the modified preset scoring rule, the method further comprises:
Obtaining the modified preset scoring rule.
Optionally, obtaining the modified preset scoring rule comprises:
Scoring the plurality of text features by a preset scoring rule, and obtaining a second average score of the scores of the plurality of text features according to those scores and the preset weight values corresponding to the text features;
Scoring the plurality of text features using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features;
Comparing the second average score with the third average score and determining, according to the comparison result, whether to modify the preset weight values of the preset scoring rule.
Optionally, determining, according to the comparison result, whether to modify the preset weight values of the preset scoring rule comprises:
Comparing the second average score with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, the preset weight values of the preset scoring rule do not need to be modified;
Or
If the absolute value of the difference is greater than the preset difference, modifying the preset weight values of the preset scoring rule and obtaining a fourth average score of the scores of the plurality of text features again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
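The comparison above can be written compactly. The notation below is not from the original text and is introduced only for illustration: s_i and w_i denote the score and preset weight value of the i-th text feature, w_i' the modified weight values, \bar{S}_2, \bar{S}_3 and \bar{S}_4 the second, third and fourth average scores, and \delta the preset difference. Reading the second average score as a weighted mean is one plausible interpretation of "according to the scores and the preset weight values", not something the text states explicitly.

```latex
\begin{align*}
\bar{S}_2 &= \frac{\sum_{i=1}^{n} w_i s_i}{\sum_{i=1}^{n} w_i}, &
\bar{S}_4 &= \frac{\sum_{i=1}^{n} w_i' s_i}{\sum_{i=1}^{n} w_i'}, \\
\left|\bar{S}_2 - \bar{S}_3\right| &\le \delta
  &&\Rightarrow\ \text{keep } w_1, \dots, w_n, \\
\left|\bar{S}_2 - \bar{S}_3\right| &> \delta
  &&\Rightarrow\ \text{modify the weights until } \left|\bar{S}_4 - \bar{S}_3\right| \le \delta .
\end{align*}
```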
In a second aspect, the present invention further provides an automatic correction system for English compositions, comprising:
An extraction module, configured to extract a plurality of text features of a composition to be corrected;
A scoring module, configured to score the plurality of text features by a modified preset scoring rule;
A first acquisition module, configured to obtain a first average score of the scores of the plurality of text features, and to use the first average score as the score of the composition to be corrected.
Optionally, the extraction module is configured to:
Extract the word feature, the sentence feature, the paragraph structure feature, and the topic sentence semantic feature of the composition to be corrected.
Optionally, the system further comprises a second acquisition module, configured to obtain the modified preset scoring rule before the plurality of text features are scored by the modified preset scoring rule.
Optionally, the second acquisition module is configured to:
Score the plurality of text features by a preset scoring rule, and obtain a second average score of the scores of the plurality of text features according to those scores and the preset weight values corresponding to the text features;
Score the plurality of text features using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features;
Compare the second average score with the third average score and determine, according to the comparison result, whether to modify the preset weight values of the preset scoring rule.
Optionally, the second acquisition module is configured to:
Compare the second average score with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, leave the preset weight values of the preset scoring rule unmodified;
Or
If the absolute value of the difference is greater than the preset difference, modify the preset weight values of the preset scoring rule and obtain a fourth average score of the scores of the plurality of text features again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
As can be seen from the above technical solution, the present invention provides an automatic correction method and system for English compositions. The method does not need a large amount of corpus data, which alleviates the demand for a large corpus volume. At the same time, the text features are scored by the modified preset scoring rule, so the final scoring result is relatively accurate and batch automatic correction of compositions can be achieved.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an automatic correction method for English compositions provided by one embodiment of the present invention;
Fig. 2 is a schematic flowchart of an automatic correction method for English compositions provided by another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an automatic correction system for English compositions provided by one embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Fig. 1 shows a schematic flowchart of an automatic correction method for English compositions provided by one embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
101, extracting a plurality of text features of a composition to be corrected;
102, scoring the plurality of text features by a modified preset scoring rule;
103, obtaining a first average score of the scores of the plurality of text features, and using the first average score as the score of the composition to be corrected.
The above method does not need a large amount of corpus data, which alleviates the demand for a large corpus volume. At the same time, the text features are scored by the modified preset scoring rule, so the final scoring result is relatively accurate and batch automatic correction of compositions can be achieved.
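Steps 101 to 103 can be sketched as a small pipeline. This is a minimal illustration, not the patented implementation: the function name `correct_composition`, the callables passed in, and the toy rules are all hypothetical, and the weighted mean is an assumed reading of how the modified rule's weight values enter the first average score (with equal weights it reduces to a plain mean).

```python
from typing import Callable, Dict

Features = Dict[str, float]


def correct_composition(
    essay: str,
    extract: Callable[[str], Features],
    scorers: Dict[str, Callable[[float], float]],
    weights: Dict[str, float],
) -> float:
    """Return the composition score (the 'first average score')."""
    # Step 101: extract the text features of the composition to be corrected.
    features = extract(essay)
    # Step 102: score each feature with its rule (hypothetical callables).
    scores = {name: scorer(features[name]) for name, scorer in scorers.items()}
    # Step 103: average the feature scores. The weights stand in for the
    # weight values carried by the modified preset scoring rule (assumption).
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


# Toy usage with made-up features and rules.
toy_extract = lambda text: {
    "word": float(len(text.split())),
    "sentence": float(text.count(".")),
}
toy_scorers = {
    "word": lambda n: 5.0 if n >= 3 else 2.0,
    "sentence": lambda n: 3.0,
}
final_score = correct_composition(
    "I like tea.", toy_extract, toy_scorers, {"word": 1.0, "sentence": 1.0}
)
```

Passing the extractor and scorers as arguments keeps the sketch independent of any particular scoring rule (GRE, TOEFL, and so on).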
It will be understood that step 101 extracts a plurality of text features of the composition to be corrected. Specifically, these are the word feature, the sentence feature, the paragraph structure feature, and the semantic feature of the topic sentence. The word feature comprises the character length and vocabulary level of the words; the sentence feature comprises the character length and clause complexity of the sentences; the paragraph structure feature refers to the distribution of character lengths across the paragraphs; and the semantic feature of the topic sentence refers to the degree of character matching between the topic sentence and the composition title.
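The four feature groups could be extracted roughly as follows. This is a sketch under several assumptions that the text does not specify: the tiny `VOCAB_GRADE` table is a hypothetical stand-in for a graded word list, commas per sentence is a crude proxy for clause complexity, the first sentence is taken as the topic sentence, and the matching degree is computed as word-set overlap (Jaccard) rather than any particular character-matching measure.

```python
import re
from typing import Dict, List, Union

# Hypothetical vocabulary-level lookup; a real system would use a graded word list.
VOCAB_GRADE = {"ubiquitous": 3, "significant": 2, "important": 1}

WORD_RE = re.compile(r"[A-Za-z']+")


def extract_features(title: str, essay: str) -> Dict[str, Union[float, List[int]]]:
    paragraphs = [p.strip() for p in essay.split("\n\n") if p.strip()]
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", essay) if s.strip()]
    words = WORD_RE.findall(essay.lower())

    # Assumption: the topic sentence is the first sentence of the essay.
    topic_words = set(WORD_RE.findall(sentences[0].lower())) if sentences else set()
    title_words = set(WORD_RE.findall(title.lower()))
    union = topic_words | title_words

    n_words = len(words) or 1          # avoid division by zero on empty input
    n_sentences = len(sentences) or 1
    return {
        # Word feature: character length and vocabulary level.
        "avg_word_length": sum(map(len, words)) / n_words,
        "avg_vocab_grade": sum(VOCAB_GRADE.get(w, 0) for w in words) / n_words,
        # Sentence feature: character length and a proxy for clause complexity.
        "avg_sentence_length": sum(map(len, sentences)) / n_sentences,
        "commas_per_sentence": essay.count(",") / n_sentences,
        # Paragraph structure feature: character length of each paragraph.
        "paragraph_lengths": [len(p) for p in paragraphs],
        # Topic sentence feature: overlap between topic sentence and title.
        "topic_title_match": len(topic_words & title_words) / len(union) if union else 0.0,
    }


features = extract_features(
    "City life", "City life is busy. People work hard.\n\nStill, it is fun."
)
```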
Before step 102 scores the plurality of text features by the modified preset scoring rule, the method further comprises a step not shown in Fig. 1:
Obtaining the modified preset scoring rule.
Specifically, the automatic correction method proposed by the present invention is divided into two stages. In the first stage, features are extracted and the scoring logic is designed according to the scoring rules for English compositions (GRE, TOEFL, etc.), forming a basic scoring system. In the second stage, an annotated corpus is collected and the weight parameters of the features are adjusted, thereby correcting the scoring result of the first stage. Obtaining the modified preset scoring rule therefore proceeds in two stages, as shown in Fig. 2:
First stage, rule-based scoring: the plurality of text features are scored by a preset scoring rule, and a second average score of the scores of the plurality of text features is obtained according to those scores and the preset weight values corresponding to the text features.
That is, the algorithm logic is designed according to the scoring rule. Different application scenarios for English compositions have different scoring rules (GRE, TOEFL, etc.). Based on how each score band in the scoring rule describes the features, logic for adding or deducting points is designed, yielding a score for each feature; the average of these scores is the first-stage score (the second average score).
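As an illustration of the first stage, the sketch below pairs one bonus/deduction rule with a weighted average. The thresholds, the 0 to 5 scale, and the names `score_sentence_length` and `second_average` are hypothetical; a real rule set would be derived from the score-band descriptions of the scoring rule in use.

```python
from typing import Dict


def score_sentence_length(avg_chars: float) -> float:
    """Hypothetical bonus/deduction rule on a 0-5 scale."""
    score = 3.0                      # base score
    if 60.0 <= avg_chars <= 120.0:
        score += 1.0                 # bonus: sentences of reasonable length
    elif avg_chars < 30.0:
        score -= 1.0                 # deduction: very short sentences
    return max(0.0, min(5.0, score))


def second_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted mean of the feature scores using the preset weight values."""
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


stage_one = second_average(
    {"word": 4.0, "sentence": score_sentence_length(20.0)},
    {"word": 0.5, "sentence": 0.5},
)
```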
Second stage, statistics-based scoring: the plurality of text features are scored using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features.
That is, annotated corpus data is collected. The data comprises compositions and their corresponding manually assigned scores (original scores), and it needs to cover every score band. The text features extracted from the composition to be corrected are scored using the annotated corpus data to obtain the third average score.
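The text does not say how the annotated corpus is used to score a feature. One simple possibility, shown here purely as an assumption, is a nearest-neighbour lookup: for each feature, take the k corpus compositions whose feature value is closest and average their manual scores. The data layout (`features`, `human_score`) and function names are hypothetical.

```python
from typing import Dict, List

CorpusEntry = Dict[str, object]  # {"features": {...}, "human_score": float}


def corpus_feature_score(value: float, corpus: List[dict], feature: str, k: int = 3) -> float:
    """Mean manual score of the k corpus essays closest in this feature (assumed method)."""
    nearest = sorted(corpus, key=lambda e: abs(e["features"][feature] - value))[:k]
    return sum(e["human_score"] for e in nearest) / len(nearest)


def third_average(features: Dict[str, float], corpus: List[dict], k: int = 3) -> float:
    """Plain mean of the corpus-based feature scores."""
    scores = [corpus_feature_score(v, corpus, name, k) for name, v in features.items()]
    return sum(scores) / len(scores)


# Toy annotated corpus; it should cover every score band in practice.
toy_corpus = [
    {"features": {"avg_sentence_length": 10.0}, "human_score": 2.0},
    {"features": {"avg_sentence_length": 20.0}, "human_score": 4.0},
    {"features": {"avg_sentence_length": 30.0}, "human_score": 5.0},
]
stage_two = third_average({"avg_sentence_length": 21.0}, toy_corpus, k=2)
```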
After these two stages, the second average score is compared with the third average score, and whether to modify the preset weight values of the preset scoring rule is determined according to the comparison result.
Specifically, the second average score is compared with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, the preset weight values of the preset scoring rule do not need to be modified;
In another possible implementation, if the absolute value of the difference is greater than the preset difference, the preset weight values of the preset scoring rule are modified, and a fourth average score of the scores of the plurality of text features is obtained again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
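The text specifies the stopping condition but not how the weight values are modified. The loop below is a sketch with an assumed update rule: each weight is scaled up or down depending on whether its feature score lies on the same side of the current average as the target. The names `correct_weights`, `step`, and `max_iter` are hypothetical, and the loop can only converge when the third average score lies between the lowest and highest feature scores.

```python
from typing import Dict, Tuple


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    total_weight = sum(weights[name] for name in scores)
    return sum(weights[name] * scores[name] for name in scores) / total_weight


def correct_weights(
    scores: Dict[str, float],
    weights: Dict[str, float],
    third_avg: float,
    preset_diff: float,
    step: float = 0.1,
    max_iter: int = 1000,
) -> Tuple[Dict[str, float], float]:
    """Adjust weights until |average - third_avg| <= preset_diff."""
    current = dict(weights)
    for _ in range(max_iter):
        avg = weighted_average(scores, current)   # second, then fourth, average score
        diff = third_avg - avg
        if abs(diff) <= preset_diff:
            return current, avg
        # Assumed update: favour features whose score pulls the average
        # toward the corpus-based target; keep every weight positive.
        for name in current:
            factor = 1.0 + step * diff * (scores[name] - avg)
            current[name] = max(1e-6, current[name] * factor)
    raise RuntimeError("weights did not converge within max_iter iterations")


feature_scores = {"word": 4.0, "sentence": 3.0, "paragraph": 5.0, "topic": 2.0}
preset_weights = {name: 0.25 for name in feature_scores}
new_weights, fourth_avg = correct_weights(feature_scores, preset_weights, 4.0, 0.05)
```

With equal preset weights the starting average is 3.5, so the loop shifts weight toward the higher-scoring features until the fourth average score is within 0.05 of the target 4.0.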
A modified preset scoring rule is obtained by the above method, and the composition to be corrected is then scored by the modified preset scoring rule, giving a more accurate result. The two-stage approach, which fuses statistics and rules, not only alleviates the statistical method's need for a large volume of corpus data but also helps address the comprehensiveness and accuracy problems faced when formulating rules. In addition, the weight parameter of each text feature is determined by correcting for the score deviation, which makes the weight parameters more accurate.
The above method can correct compositions in batches and reduces the amount of data to be collected and annotated, saving time and manpower. Moreover, combining the statistics-based and rule-based technical schemes makes the results more robust.
Fig. 3 shows a schematic structural diagram of an automatic correction system for English compositions provided by one embodiment of the present invention. As shown in Fig. 3, the system comprises:
An extraction module 31, configured to extract a plurality of text features of a composition to be corrected;
A scoring module 32, configured to score the plurality of text features by a modified preset scoring rule;
A first acquisition module 33, configured to obtain a first average score of the scores of the plurality of text features, and to use the first average score as the score of the composition to be corrected.
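The three modules map naturally onto three pluggable components. The class below is only a structural sketch: the class name `CorrectionSystem`, the attribute names, and the toy callables are hypothetical and chosen to mirror modules 31, 32 and 33.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CorrectionSystem:
    # Module 31: composition text -> text features.
    extraction_module: Callable[[str], Dict[str, float]]
    # Module 32: text features -> per-feature scores (modified preset scoring rule).
    scoring_module: Callable[[Dict[str, float]], Dict[str, float]]
    # Module 33: per-feature scores -> first average score.
    first_acquisition_module: Callable[[Dict[str, float]], float]

    def correct(self, essay: str) -> float:
        features = self.extraction_module(essay)
        scores = self.scoring_module(features)
        return self.first_acquisition_module(scores)


system = CorrectionSystem(
    extraction_module=lambda essay: {"words": float(len(essay.split()))},
    scoring_module=lambda feats: {name: min(5.0, value) for name, value in feats.items()},
    first_acquisition_module=lambda scores: sum(scores.values()) / len(scores),
)
result = system.correct("a b c")
```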
In a preferred implementation of this embodiment, the extraction module 31 is configured to:
Extract the word feature, the sentence feature, the paragraph structure feature, and the topic sentence semantic feature of the composition to be corrected.
In a preferred implementation of this embodiment, the system further comprises a second acquisition module 34, not shown in Fig. 3, configured to obtain the modified preset scoring rule before the plurality of text features are scored by the modified preset scoring rule.
In a preferred implementation of this embodiment, the second acquisition module 34 is configured to:
Score the plurality of text features by a preset scoring rule, and obtain a second average score of the scores of the plurality of text features according to those scores and the preset weight values corresponding to the text features;
Score the plurality of text features using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features;
Compare the second average score with the third average score and determine, according to the comparison result, whether to modify the preset weight values of the preset scoring rule.
In a preferred implementation of this embodiment, the second acquisition module 34 is configured to:
Compare the second average score with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, leave the preset weight values of the preset scoring rule unmodified;
Or
If the absolute value of the difference is greater than the preset difference, modify the preset weight values of the preset scoring rule and obtain a fourth average score of the scores of the plurality of text features again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
The above system corresponds one-to-one with the above method, and the implementation details of the method apply equally to the system, so this embodiment does not describe the system's implementation details further.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An automatic correction method for English compositions, characterized by comprising:
Extracting a plurality of text features of a composition to be corrected;
Scoring the plurality of text features by a modified preset scoring rule;
Obtaining a first average score of the scores of the plurality of text features, and using the first average score as the score of the composition to be corrected.
2. The method according to claim 1, characterized in that extracting the plurality of text features of the composition to be corrected comprises:
Extracting the word feature, the sentence feature, the paragraph structure feature, and the topic sentence semantic feature of the composition to be corrected.
3. The method according to claim 1, characterized in that, before the plurality of text features are scored by the modified preset scoring rule, the method further comprises:
Obtaining the modified preset scoring rule.
4. The method according to claim 3, characterized in that obtaining the modified preset scoring rule comprises:
Scoring the plurality of text features by a preset scoring rule, and obtaining a second average score of the scores of the plurality of text features according to those scores and the preset weight values corresponding to the text features;
Scoring the plurality of text features using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features;
Comparing the second average score with the third average score and determining, according to the comparison result, whether to modify the preset weight values of the preset scoring rule.
5. The method according to claim 4, characterized in that determining, according to the comparison result, whether to modify the preset weight values of the preset scoring rule comprises:
Comparing the second average score with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, the preset weight values of the preset scoring rule do not need to be modified;
Or
If the absolute value of the difference is greater than the preset difference, modifying the preset weight values of the preset scoring rule and obtaining a fourth average score of the scores of the plurality of text features again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
6. An automatic correction system for English compositions, characterized by comprising:
An extraction module, configured to extract a plurality of text features of a composition to be corrected;
A scoring module, configured to score the plurality of text features by a modified preset scoring rule;
A first acquisition module, configured to obtain a first average score of the scores of the plurality of text features, and to use the first average score as the score of the composition to be corrected.
7. The system according to claim 6, characterized in that the extraction module is configured to:
Extract the word feature, the sentence feature, the paragraph structure feature, and the topic sentence semantic feature of the composition to be corrected.
8. The system according to claim 6, characterized in that the system further comprises a second acquisition module, configured to obtain the modified preset scoring rule before the plurality of text features are scored by the modified preset scoring rule.
9. The system according to claim 8, characterized in that the second acquisition module is configured to:
Score the plurality of text features by a preset scoring rule, and obtain a second average score of the scores of the plurality of text features according to those scores and the preset weight values corresponding to the text features;
Score the plurality of text features using preset annotated corpus data to obtain a third average score of the scores of the plurality of text features;
Compare the second average score with the third average score and determine, according to the comparison result, whether to modify the preset weight values of the preset scoring rule.
10. The system according to claim 9, characterized in that the second acquisition module is configured to:
Compare the second average score with the third average score to obtain the absolute value of the difference between them;
If the absolute value of the difference is less than or equal to a preset difference, leave the preset weight values of the preset scoring rule unmodified;
Or
If the absolute value of the difference is greater than the preset difference, modify the preset weight values of the preset scoring rule and obtain a fourth average score of the scores of the plurality of text features again with the modified weight values, until the absolute value of the difference between the fourth average score and the third average score is less than or equal to the preset difference.
CN201510536642.8A 2015-08-27 2015-08-27 English composition automatic correcting method and system Pending CN105183713A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510536642.8A CN105183713A (en) 2015-08-27 2015-08-27 English composition automatic correcting method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510536642.8A CN105183713A (en) 2015-08-27 2015-08-27 English composition automatic correcting method and system

Publications (1)

Publication Number Publication Date
CN105183713A true CN105183713A (en) 2015-12-23

Family

ID=54905802

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510536642.8A Pending CN105183713A (en) 2015-08-27 2015-08-27 English composition automatic correcting method and system

Country Status (1)

Country Link
CN (1) CN105183713A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280065A (en) * 2017-01-05 2018-07-13 广州讯飞易听说网络科技有限公司 A kind of foreign language text evaluation method and device
CN108595410A (en) * 2018-03-19 2018-09-28 小船出海教育科技(北京)有限公司 The automatic of hand-written composition corrects method and device
CN109285404A (en) * 2018-10-25 2019-01-29 安徽创见未来教育科技有限公司 A kind of English composition automatic scoring system
CN110390032A (en) * 2019-07-26 2019-10-29 江苏曲速教育科技有限公司 Method and system are read and made comments in a kind of hand-written composition
CN111460799A (en) * 2020-02-24 2020-07-28 云知声智能科技股份有限公司 English grammar correction method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6115683A (en) * 1997-03-31 2000-09-05 Educational Testing Service Automatic essay scoring system using content-based techniques
CN1700200A (en) * 2005-05-30 2005-11-23 梁茂成 English composition automatic scoring system
CN102279844A (en) * 2011-08-31 2011-12-14 中国科学院自动化研究所 Method and system for automatically testing Chinese composition
CN103294660A (en) * 2012-02-29 2013-09-11 张跃 Automatic English composition scoring method and system
CN104778160A (en) * 2015-04-27 2015-07-15 桂林电子科技大学 Analysis method for subject relevance of English composition contents

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
范涛松: "基于神经网络的英语作文自动评分模型研究与实现", 《万方数据库》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108280065A (en) * 2017-01-05 2018-07-13 广州讯飞易听说网络科技有限公司 A kind of foreign language text evaluation method and device
CN108280065B (en) * 2017-01-05 2021-12-14 广州讯飞易听说网络科技有限公司 Foreign text evaluation method and device
CN108595410A (en) * 2018-03-19 2018-09-28 小船出海教育科技(北京)有限公司 The automatic of hand-written composition corrects method and device
CN109285404A (en) * 2018-10-25 2019-01-29 安徽创见未来教育科技有限公司 A kind of English composition automatic scoring system
CN110390032A (en) * 2019-07-26 2019-10-29 江苏曲速教育科技有限公司 Method and system are read and made comments in a kind of hand-written composition
CN110390032B (en) * 2019-07-26 2021-08-17 江苏曲速教育科技有限公司 Method and system for reading handwritten composition
CN111460799A (en) * 2020-02-24 2020-07-28 云知声智能科技股份有限公司 English grammar correction method and device
CN111460799B (en) * 2020-02-24 2023-10-20 云知声智能科技股份有限公司 English grammar correcting method and device

Similar Documents

Publication Publication Date Title
CN105183713A (en) English composition automatic correcting method and system
CN107608963B (en) Chinese error correction method, device and equipment based on mutual information and storage medium
KR101877693B1 (en) Intelligent scoring method and system for text objective question
CN103020022B (en) A kind of Chinese unknown word identification system and method based on improving Information Entropy Features
CN105045778B (en) A kind of Chinese homonym mistake auto-collation
CN103294660B (en) A kind of english composition automatic scoring method and system
CN104991889B (en) A kind of non-multi-character word error auto-collation based on fuzzy participle
CN104142912A (en) Accurate corpus category marking method and device
CN104750672A (en) Chinese word error correction method used in search and device thereof
CN110362820B (en) Bi-LSTM algorithm-based method for extracting bilingual parallel sentences in old and Chinese
CN104360996A (en) Sentence alignment method of bilingual text
CN104933023A (en) Chinese address word segmentation and annotation method
CN105374248B (en) A kind of methods, devices and systems for correcting pronunciation
CN109213856A (en) A kind of method for recognizing semantics and system
CN108664474A (en) A kind of resume analytic method based on deep learning
CN103970765A (en) Error correcting model training method and device, and text correcting method and device
CN105068990B (en) A kind of English long sentence dividing method of more strategies of Machine oriented translation
CN103500216B (en) Method for extracting file information
KR20160034678A (en) Apparatus for grammatical error correction and method using the same
CN105608074B (en) A kind of word counting method and device
CN103608805A (en) Dictionary generation device, method, and program
CN104239292B (en) A kind of method for obtaining specialized vocabulary translation
CN104463157A (en) Electronic identification method for handwritten characters
CN104636431A (en) Automatic extraction and optimizing method for document abstracts of different fields
CN109062888A (en) A kind of self-picketing correction method when there is Error Text input

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned

Effective date of abandoning: 20190409