Valuable technical information is buried in under-utilised, user-generated technical texts in engineering domains such as manufacturing, logistics and maintenance. For maintenance and reliability personnel, the unstructured technical text in maintenance work orders (MWOs) holds crucial information about failures and work performed on physical assets. However, the domain-specific language used and the scarcity of shared labelled data sets in these contexts present formidable challenges to contemporary natural language processing (NLP) techniques, so performance falls short of that achieved in non-engineering domains. In this work, we explore the structure of language in technical short texts by learning a context-free grammar (CFG) through unsupervised grammar induction on industrial MWO texts. We exploit the grammar’s generative properties for novel sentence generation and corpus construction, and assess its viability for developing synthetic MWO data sets. The results demonstrate that a) there exists a grammar in the MWOs, b) the grammar was able to model aspects of the maintenance technical language to produce 12k synthetic MWO texts that are 93% as natural and 87% as correct as real texts, and c) the domain-specific language used in technical short texts remains challenging to parse due to low data quality and sparsity. Contributions of this work include baseline results for grammar-based synthetic technical text generation and an appreciation of the challenges in assessing the engineering correctness and naturalness of the new synthetic texts.
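To illustrate how a CFG's generative properties can yield new short texts, here is a minimal sketch in Python. The rules in `GRAMMAR`, the start symbol `MWO`, and the helper names `generate` and `sample_corpus` are hypothetical and hand-written for illustration. They are not the grammar induced in this work, which is learned from industrial MWO texts rather than specified by hand.

```python
import random

# Hypothetical toy grammar (an assumption for illustration only).
# Each nonterminal maps to a list of alternative right-hand sides.
# The first alternative of every nonterminal is non-recursive, so it
# can serve as a safe fallback once the depth limit is reached.
GRAMMAR = {
    "MWO": [["ACTION", "ITEM"], ["ACTION", "ITEM", "STATE"]],
    "ACTION": [["replace"], ["repair"], ["inspect"]],
    "ITEM": [["pump"], ["MODIFIER", "ITEM"]],
    "MODIFIER": [["hydraulic"], ["cooling"]],
    "STATE": [["leaking"], ["not", "working"]],
}


def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` top-down into a list of terminal tokens."""
    if symbol not in GRAMMAR:
        # Anything that is not a key of GRAMMAR is treated as a terminal.
        return [symbol]
    alternatives = GRAMMAR[symbol]
    # Past the depth limit, take the non-recursive first alternative.
    rhs = alternatives[0] if depth >= max_depth else rng.choice(alternatives)
    tokens = []
    for child in rhs:
        tokens.extend(generate(child, rng, depth + 1, max_depth))
    return tokens


def sample_corpus(n, seed=0):
    """Sample `n` synthetic short texts from the toy grammar."""
    rng = random.Random(seed)
    return [" ".join(generate("MWO", rng)) for _ in range(n)]


if __name__ == "__main__":
    for text in sample_corpus(5):
        print(text)
```

Sampling uniformly over alternatives is the simplest choice. A probabilistic CFG would instead weight each rule, which is one way such a sketch could be extended.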
|Title of host publication||AI 2022|
|Subtitle of host publication||Advances in Artificial Intelligence - 35th Australasian Joint Conference, AI 2022, Proceedings|
|Editors||Haris Aziz, Débora Corrêa, Tim French|
|Publisher||Springer Science + Business Media|
|Number of pages||14|
|Publication status||Published - 2022|
|Event||35th Australasian Joint Conference on Artificial Intelligence, AI 2022 - Perth, Australia (5 Dec 2022 → 9 Dec 2022)|
|Name||Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)|
|Conference||35th Australasian Joint Conference on Artificial Intelligence, AI 2022|
|Period||5/12/22 → 9/12/22|
Fingerprint: research topics of 'Using Context-Free Grammar to Generate Synthetic Technical Short Texts'.
Project (active): ARC Training Centre for Transforming Maintenance through Data Science
Rohl, A., Small, M., Hodkiewicz, M., Loxton, R., O'Halloran, K., Tan, T., Calo, V., Reynolds, M., Liu, W., While, R., French, T., Cripps, E. & Cardell-Oliver, R.
1/01/19 → 31/12/23