GEOINFORMATICS

Knowledge graphs are becoming popular because they describe the real world in a graph language that both humans and machines can understand. This paper presents a case study on constructing a knowledge graph of porphyry copper deposits. First, raw text data were collected and integrated from selected porphyry copper deposits and porphyry-skarn copper deposits in the Qinzhou Bay - Hangzhou Bay metallogenic belt, South China. Second, the entities, relations, and attributes in the text were labeled and extracted with reference to the conceptual model of porphyry copper deposits in the study area. Third, a knowledge graph of porphyry copper deposits was constructed using Neo4j 4.3. The resulting knowledge graph provides the basic functions of a knowledge graph application. Furthermore, as part of a planned integrated knowledge graph that extends from the single deposit, through the metallogenic series, up to the metallogenic province, the understanding gained in this study may be extended to mineral resource prospectivity and assessment. The interrelationships among the Earth system, the metallogenic system, the exploration system, and the prospectivity and assessment system (ES-MS-ES-PS) should be fully understood, and a knowledge graph system for ES-MS-ES-PS is needed. The key scientific and technological problems in achieving the ES-MS-ES-PS knowledge graph system include the progressive correlation system of the ES-MS-ES-PS domain ontologies and knowledge graphs; automatic construction technology for complex ES-MS-ES-PS domain ontologies and knowledge graphs; self-evolution and completion techniques for the ES-MS-ES-PS knowledge graph based on multi-modal associated data embedding; and the theory and methods of resource prospectivity and assessment based on knowledge graphs, big data mining, and artificial intelligence.


Introduction
In the present era of big data, data grow explosively and tend to be massive, heterogeneous, and loosely organized, which seriously challenges effective access to information and knowledge. The fundamental solution is to extend the human brain with the help of machines and machine learning, which urgently requires a language that people and machines can both understand [1][2][3].
The knowledge graph is one such technology. It is an integral part of artificial intelligence technology and is regarded as a form of interpretable artificial intelligence. With its powerful semantic processing ability and open organization ability, it provides an effective tool for knowledge organization and intelligent applications of information in the era of big data. Since Google formally proposed the knowledge graph in 2012, it has attracted much attention from researchers and has been widely used in intelligent search, intelligent question answering, personalized recommendation, and so on [4][5][6][7][8][9][10][11][12].
This paper presents a case study on constructing a knowledge graph, taking porphyry copper deposits as the carrier. It introduces the construction method for a knowledge graph of the ore deposit domain and discusses how the idea can be extended to a knowledge graph of the Earth system - Metallogenic system - Exploration system - Prediction and evaluation system.

Methodology
The basis of the knowledge graph is a semantic network that reveals the relationships between entities [13]. It describes real-world things and their relations as a graph and stores them in a database as "entity-relationship-entity" triples. As a network, it consists of nodes and edges. A node represents an entity, that is, any thing or concept in the real world; it can be a concrete entity, such as a known ore occurrence, or an abstract concept, such as the concept of a porphyry copper deposit. An edge represents a relationship between entities; in many cases the relationship is expressed as an attribute, such as the location, rock mass, element content, mineralization time, or mineralization process of an ore occurrence. Figure 1 shows how entities, attributes, and relationships are represented in a knowledge graph.
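The triple representation described above can be sketched in a few lines of Python. This is only an illustration: the `Triple` type, the `build_adjacency` helper, the relation names, and the example statements are hypothetical and are not taken from the data set of this study.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One "entity-relationship-entity" statement of the graph."""
    head: str
    relation: str
    tail: str


# Hypothetical example statements, for illustration only.
triples = [
    Triple("Deposit A", "instance_of", "Porphyry copper deposit"),
    Triple("Deposit A", "hosted_by", "Granodiorite porphyry"),
    Triple("Deposit A", "has_ore_mineral", "Chalcopyrite"),
]


def build_adjacency(statements):
    """Index triples by head node: node -> list of (relation, tail) edges."""
    adjacency = defaultdict(list)
    for t in statements:
        adjacency[t.head].append((t.relation, t.tail))
    return dict(adjacency)


graph = build_adjacency(triples)
# Every head and every tail is a node of the network.
nodes = {t.head for t in triples} | {t.tail for t in triples}
```

Here each triple contributes one edge, and the set of nodes is simply every name that appears as a head or a tail.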
The complete knowledge graph architecture includes knowledge acquisition, knowledge representation, knowledge storage, knowledge modeling, knowledge fusion, knowledge computing, and knowledge operation and maintenance. The key technologies and processes are as follows.
Ontology modeling. Ontology modeling establishes the data model of the knowledge graph, which requires defining the concepts, attributes, and relationships of the ontology. It is the basis of the knowledge graph. A high-quality data model avoids much unnecessary and repetitive knowledge acquisition, improves the efficiency of the knowledge graph, and reduces the cost of domain data fusion.
Knowledge acquisition. In the real world, knowledge exists in structured, semi-structured, and unstructured data. Knowledge acquisition extracts knowledge from data of different sources and structures, converts it into structured knowledge that a computer can understand and process, and stores it in the knowledge graph. For text data, the extraction tasks include entity extraction, relationship extraction, attribute extraction, and event extraction.
Knowledge storage. The underlying storage method is designed to store all kinds of knowledge so as to support the effective management and computation of large-scale graph data. The objects of knowledge storage include
Knowledge fusion. Knowledge fusion integrates knowledge from loosely coupled sources into a synthetic resource in order to supplement incomplete knowledge and acquire new knowledge. It is an interdisciplinary subject of knowledge organization and information fusion. Through the acquisition, matching, integration, mining, and other processing of knowledge from many scattered and heterogeneous resources, hidden or valuable new knowledge can be obtained, the structure and connotation of knowledge can be optimized, and knowledge services can be provided.
Knowledge operation and maintenance. After the initial construction, the knowledge graph must be iterated, evolved, and improved according to application feedback, newly emerging knowledge of the same type, and new knowledge sources. During operation and maintenance, the quality of the knowledge graph must be kept under control while its content is gradually enriched. The operation and maintenance of a knowledge graph is an engineering system that covers the whole life cycle of the knowledge graph, from knowledge acquisition to knowledge computing.
Usually, three basic steps are needed to construct a knowledge graph: (1) Information extraction, which extracts entities, attributes, and the relationships among entities from unstructured and semi-structured data sources. (2) Information fusion, which resolves ambiguous concepts, removes redundant and wrong concepts, and ensures the quality of the knowledge.
(3) Knowledge processing, which includes quality evaluation and reasoning-based expansion of the knowledge to obtain a structured and networked knowledge system.
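As a minimal sketch of the information fusion step, the following Python code maps alias names to one canonical name and removes the duplicate triples that result. It assumes that an alias table is already available; the `fuse_triples` function, the `ALIASES` table, and the relation names are hypothetical and only illustrate the idea.

```python
def fuse_triples(triples, aliases):
    """Map alias names to a canonical name and drop duplicate triples."""
    seen = set()
    fused = []
    for head, relation, tail in triples:
        canonical = (aliases.get(head, head), relation, aliases.get(tail, tail))
        if canonical not in seen:
            seen.add(canonical)
            fused.append(canonical)
    return fused


# Hypothetical alias table: two names that refer to the same deposit.
ALIASES = {"Dexing copper mine": "Dexing copper deposit"}

raw = [
    ("Dexing copper deposit", "instance_of", "Porphyry copper deposit"),
    ("Dexing copper mine", "instance_of", "Porphyry copper deposit"),
    ("Dexing copper mine", "located_in",
     "Qinzhou Bay - Hangzhou Bay metallogenic belt"),
]

fused = fuse_triples(raw, ALIASES)
```

In practice, building the alias table is the hard part of entity disambiguation; the sketch only shows how such a table removes redundancy once it exists.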
Structured data and text data are the main sources of knowledge. Commonly used tools for acquiring knowledge from structured databases include Triplify, D2RServer, OpenLink, SparqlMap, and Ontop. Tools for knowledge graph visualization include Citespace, Protégé, and Neo4j. Citespace is information visualization software written in Java. Based on co-citation analysis theory and the pathfinder algorithm, Citespace measures the literature of a specific field to find the key paths and knowledge inflection points in the evolution of a discipline.

Knowledge graph of porphyry copper deposits
Constructing the knowledge graph of porphyry copper deposits involves the following main steps.
(1) Raw data acquisition. Six porphyry copper deposits and porphyry-skarn copper deposits in the Qinzhou Bay - Hangzhou Bay metallogenic belt of South China were selected as the experimental objects: the Dexing, Yongping, Qibaoshan, Baoshan, Dabaoshan, and Yuanzhuding copper deposits. Relevant geological and mineral survey reports and published academic papers were systematically collected to form the initial data.
(2) Annotation and extraction of entities, relationships, and attributes from the initial data, based on the conceptual model of porphyry copper deposits. The Xuanyuan data annotation system was used for data annotation. It is a general-purpose, GUI-based annotation system that allows an annotation file to be divided into multiple annotation tasks and allows multiple users to annotate and review. The system provides data annotation services, including batch storage and management of annotation files, auxiliary tools that reduce the difficulty of manual annotation, and machine annotation for specific fields.
The data extracted from the text are classified and standardized into a triple format with five columns: entity, entity type, relationship, attribute, and attribute type. The entity is an actual deposit, such as the Dexing porphyry copper deposit. The entity type is the type of the deposit, such as porphyry copper deposit. The attribute describes a characteristic of the deposit, and the attribute type is the type of that attribute. A section of the standardized CSV data is shown in Table.
(3) Graph generation. Python is used to read the data and write them into the Neo4j graph database. A new local database is created and named in Neo4j, and the existing CSV data are then imported into it through py2neo to generate the knowledge graph (Fig. 2).
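The graph generation step can be sketched as follows. The code turns each five-column CSV row into a Cypher MERGE statement with parameters; it does not connect to Neo4j, and it is not the script used in this study. The column header names, the sample row, and the helper functions are assumptions made for illustration. Because Cypher does not accept labels or relationship types as parameters, they are backtick-quoted in the statement text.

```python
import csv
import io

# Hypothetical sample with assumed header names for the five columns.
SAMPLE_CSV = """entity,entity_type,relationship,attribute,attribute_type
Dexing copper deposit,Porphyry copper deposit,metallogenic belt,Qinzhou Bay - Hangzhou Bay metallogenic belt,Metallogenic belt
"""


def quote_identifier(name):
    """Backtick-quote a label or relationship type for use in Cypher."""
    return "`" + name.replace("`", "``") + "`"


def row_to_cypher(row):
    """Return (statement, parameters) that merges one CSV row into the graph."""
    statement = (
        f"MERGE (e:{quote_identifier(row['entity_type'])} {{name: $entity}}) "
        f"MERGE (a:{quote_identifier(row['attribute_type'])} {{name: $attribute}}) "
        f"MERGE (e)-[:{quote_identifier(row['relationship'])}]->(a)"
    )
    parameters = {"entity": row["entity"], "attribute": row["attribute"]}
    return statement, parameters


def csv_to_cypher(text):
    """Convert CSV text into a list of (statement, parameters) pairs."""
    return [row_to_cypher(row) for row in csv.DictReader(io.StringIO(text))]


statements = csv_to_cypher(SAMPLE_CSV)
```

Each pair could then be passed to a Neo4j client library for execution. MERGE is used instead of CREATE so that importing the same row twice does not duplicate nodes or relationships.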
The knowledge graph resulting from this case study has the basic application functions of an ordinary knowledge graph. In Neo4j, Cypher statements support whole-database queries, label-specific queries, shortest-path queries, WHERE-predicate queries, keyword queries, relationship queries, addition and deletion of attributes, addition and deletion of labels, and so on.
For example, to query the nodes that have a directed relationship with a given node, input: MATCH (a:`Porphyry copper deposit` {name: 'Dexing copper deposit'})-->(b) RETURN a, b. This returns the nodes connected with the Dexing copper deposit (Fig. 3), which show the geological information and metallogenic conditions related to its formation.
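In Cypher, a pattern such as (a)-->(b) follows relationships only from a to b, whereas (a)--(b) ignores direction. The following in-memory Python sketch mimics that difference on a list of triples. It is not how Neo4j evaluates queries; the `neighbors` function, the relation name, and the example triples are hypothetical.

```python
def neighbors(triples, name, directed=True):
    """Nodes linked to `name`: outgoing edges only if directed, else both ways."""
    found = []
    for head, _relation, tail in triples:
        if head == name:
            found.append(tail)
        elif not directed and tail == name:
            found.append(head)
    return found


BELT = "Qinzhou Bay - Hangzhou Bay metallogenic belt"

# Hypothetical example triples, for illustration only.
example = [
    ("Dexing copper deposit", "located_in", BELT),
    ("Yongping copper deposit", "located_in", BELT),
]

out_of_dexing = neighbors(example, "Dexing copper deposit")
out_of_belt = neighbors(example, BELT)
around_belt = neighbors(example, BELT, directed=False)
```

With these triples, the belt has no outgoing edges, so a directed query starting from it returns nothing, while the undirected variant returns both deposits.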

Prospect: Knowledge graph of ES-MS-ES-PS
The case above is part of an ongoing experiment to build a series of knowledge graphs from the single ore deposit, through the metallogenic series, to the metallogenic province, with the aim of providing a demonstration for the future large-scale construction and application of knowledge graphs of ore deposits.
It is reasonable to build the knowledge graph of porphyry copper deposits first, since the porphyry copper metallogenic model is classic and accepted by almost all geologists. It is also the main theoretical model used in prospecting for porphyry copper deposits, and the workload of building its ontology model is relatively controllable. Based on existing geological survey reports and other unstructured and semi-structured data, the knowledge graph of porphyry copper deposits can be constructed through ontology construction, knowledge extraction, knowledge disambiguation, and knowledge fusion. Similarly, the knowledge graphs of the epithermal metallogenic system (Fig. 4) and the Qinzhou Bay - Hangzhou Bay metallogenic belt (Fig. 5) can be constructed.
A single deposit belongs to a metallogenic series and to an important metallogenic area (belt), and it inherits their attributes. Metallogenic series and important metallogenic areas (belts) intersect, and the attribute relationships between them are complex. Constructing a knowledge graph system of individual deposits, metallogenic series, and important metallogenic areas (belts) can provide valuable support for building a larger knowledge graph of the Earth system - Metallogenic system - Exploration system - Prediction and evaluation system.
The prediction and evaluation of mineral resources is one of the important application directions of geological science and has formed its own system of theory and methods [17,18]. Generally speaking, however, the existing metallogenic prediction theories and methods consist of two parts. The first is the mineral prediction model, that is, the metallogenic prediction elements and criteria established by summarizing the metallogenic regularities of typical deposits and the characteristics of geophysical, geochemical, and remote sensing anomalies.
The second is the mathematical model of prospecting information extraction and fusion, that is, a mathematical model used to quantify and fuse the prediction elements of the prediction model in order to estimate the metallogenic potential. Most research focuses on this mathematical model. Research on the mineral prediction model depends mainly on knowledge supplied by geological experts, i.e., a quantitative mineral prediction model is formed from the main metallogenic geological characteristics and prospecting indicators (including geology, geophysics, geochemistry, and remote sensing) of several typical deposits.
The traditional approach is flawed. First, its starting point is the characteristics of metallogenic geology and metallogenic regularities, that is, the characteristics of the metallogenic system itself. It does not consider the correlation between the metallogenic system and other Earth systems, such as the disaster system and the climate system, which may lead to the omission of some important prediction elements and to an incomplete prediction model. Second, over-reliance on the knowledge of geological experts reduces the effectiveness of the prediction model.
www.nznj.ru

Fig. 3. Directed relation nodes of the Dexing porphyry copper deposit. Different colors represent different types of geological properties.

Future mineral resource prediction and evaluation requires a full understanding of the relationships among the Earth system, the metallogenic system, the exploration system, and the prediction and evaluation system (ES-MS-ES-PS). Establishing an associated knowledge graph system of ES-MS-ES-PS is an important development direction.
The following key scientific and technical problems need to be solved in order to establish the knowledge graph system of ES-MS-ES-PS.
(1) Progressive correlation system of the ES-MS-ES-PS knowledge graphs. The progressive correlation among the knowledge graphs of the Earth system - Metallogenic system - Exploration system - Prediction and evaluation system is not yet well understood. Behind them are the Earth, Metallogenic, Exploration, and Mining Sciences, which are both systematic and intricate. This limits the integration of data and knowledge and also the exploration of the potential of the system. Based on the system association framework of the knowledge graph, interpretable prediction and evaluation of mineral resources can be achieved through coreference resolution and fusion of knowledge, together with graph-theoretic community detection and correlation.

Fig. 4. The visual interface of the knowledge graph of the epithermal metallogenic system

The ES-MS-ES-PS can be regarded as self-contained but interrelated systems. Logically, the Earth system includes the metallogenic system, which inherits the attributes and relations of the Earth system. The Earth system has a larger extension, and the metallogenic system has a more specific connotation. The exploration system and the prediction and evaluation system are the current expert knowledge systems. They do not coincide completely with the actual metallogenic system, but they intersect with it.
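The inheritance relation described above, in which a more specific system takes over the attributes of the system that contains it, can be sketched as a simple parent chain. The `SystemNode` class and all attribute names and values below are hypothetical placeholders and are not part of any published ES-MS-ES-PS ontology.

```python
class SystemNode:
    """A system or deposit that inherits attributes from its parent system."""

    def __init__(self, name, parent=None, **attributes):
        self.name = name
        self.parent = parent
        self.attributes = attributes

    def resolved(self):
        """Own attributes merged over those inherited from all ancestors."""
        inherited = self.parent.resolved() if self.parent else {}
        return {**inherited, **self.attributes}


# Hypothetical attribute values, for illustration only.
earth = SystemNode("Earth system", scope="global")
metallogenic = SystemNode("Metallogenic system", parent=earth,
                          product="ore deposits")
porphyry = SystemNode("Porphyry copper deposit", parent=metallogenic,
                      scope="deposit")

resolved_metallogenic = metallogenic.resolved()
resolved_porphyry = porphyry.resolved()
```

In this sketch a more specific node may override an inherited value, which is one possible reading of "larger extension, more specific connotation". Intersecting systems, such as the exploration system, would need multiple parents and are not covered here.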
(2) The ontology construction of ES-MS-ES-PS. Geological big data are taken as the main research object. First, under the guidance of the ES-MS-ES-PS ontology model, machine learning algorithms are used to model and associate the knowledge of the Earth system, metallogenic system, exploration system, and prediction and evaluation system. Part-of-speech tagging is performed on the text data. Then, the candidate entity
Through machine learning, semantic analysis, visual analysis, and other intelligent methods, the ES-MS-ES-PS are analyzed. The in-depth development of knowledge graphs in the fields of the exploration system and the prediction and evaluation system provides multi-source, multi-dimensional, spatiotemporal, and multi-scale intelligent information and knowledge services for mineral resource prediction and evaluation, improves the breadth, accuracy, and efficiency of identifying and extracting deep-seated prospecting information, and links and integrates prospecting information across the fields of ES-MS-ES-PS. Together, these will enable the smart prediction of mineral resources based on ES-MS-ES-PS.
(3) Automatic extraction technology for relationships in large-scale geological knowledge graphs. In the automatic acquisition of geological knowledge and the construction of a knowledge graph, relation extraction is the core task. Its purpose is to extract the relationships between entities from unlabeled texts, organize the entities and relationships into structured knowledge, and extend that knowledge into a knowledge graph. Traditional relation extraction relies on a supervised extraction system whose training and deployment depend heavily on large-scale manually labeled data, which consumes a great deal of time and manpower. This project develops a distantly supervised (remote-supervised) relation extraction system to compensate for the problems of the traditional supervised model. At the same time, it explores the introduction of multi-source external information to reduce the noise in distant supervision and alleviate the impact of long-tail data, so as to obtain a more robust geological knowledge extraction system.
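The core assumption of remote (distant) supervision can be sketched as follows: any sentence that mentions both entities of a known triple is automatically labeled with that triple's relation. This is a minimal illustration and not the system proposed in this project; the `distant_labels` function, the knowledge base entry, and the example sentences are invented for the purpose.

```python
def distant_labels(sentences, knowledge_base):
    """Label a sentence with a known relation if it mentions both entities."""
    labeled = []
    for sentence in sentences:
        for head, relation, tail in knowledge_base:
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, relation, tail))
    return labeled


# Hypothetical knowledge base entry and invented sentences.
KB = [("Dexing copper deposit", "located_in", "South China")]

SENTENCES = [
    "The Dexing copper deposit is a porphyry deposit in South China.",
    "Field work in South China resumed after mapping at the Dexing copper deposit.",
    "The Yongping copper deposit is described in a separate report.",
]

labeled = distant_labels(SENTENCES, KB)
```

The second sentence is labeled `located_in` although it does not actually state that relation. This is the noise problem that arises in remote supervision and that additional external information is meant to reduce.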
(4) Self-evolution and self-improvement of the knowledge graph through multi-modal associated data embedding. Heterogeneity is a problem that cannot be neglected when constructing an open geological knowledge graph. The traditional way to resolve ontology heterogeneity is ontology integration, which directly merges multiple ontologies into one large ontology that every heterogeneous system then uses. In this way, the systems can interact directly, which resolves the heterogeneity. However, ontology integration is time-consuming and laborious and lacks automated support. Whenever the source ontologies change, the integration must be repeated, and the cost is too high. In addition, the integrated ontology is not universal or flexible enough for different applications. Therefore, ontology integration is not suitable for the distributed and dynamic multi-ontology applications of a knowledge graph. In fact, most applications only need interoperability between ontologies, and complete integration is unnecessary. This project studies an ontology mapping method based on multi-modal associated data embedding. It achieves ontology interoperability by establishing mapping rules between ontologies. At the same time, it introduces the large amount of text, image, and numerical information in the knowledge base to improve the quality of mapping and matching and to complete the knowledge graph effectively.
(5) Data acquisition, access, and fusion mechanism based on the knowledge graph. Community structure is common in geological knowledge graphs. A community is a group of nodes that are closely related to each other, while their relationships with nodes outside the community are relatively loose. Obtaining and querying community data, identifying community structure, analyzing the structure and function of the whole network, and predicting the interactions between elements of the network have many applications, such as geological network analysis, identification of special geological phenomena, and deposit prediction. Traditional community detection considers only structural features and neglects the semantic information in the knowledge graph. This project will study community detection algorithms for knowledge graphs and introduce attribute-based retrieval, which can effectively improve computational efficiency.
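One common purely structural criterion for community quality is Newman's modularity, which compares the share of edges inside each community with the share expected if edges were placed at random. The sketch below computes it for a given partition of an undirected, unweighted graph. It illustrates the traditional structural approach only, not the attribute-aware algorithm that this project plans to study, and the node names are placeholders.

```python
def modularity(edges, community_of):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    internal = {}    # community -> number of edges inside it
    degree_sum = {}  # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge (placeholder node names).
EDGES = [("A", "B"), ("B", "C"), ("A", "C"),
         ("D", "E"), ("E", "F"), ("D", "F"),
         ("C", "D")]

split = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2}
single = {node: 1 for node in "ABCDEF"}

q_split = modularity(EDGES, split)
q_single = modularity(EDGES, single)
```

Splitting the graph into its two triangles gives Q = 5/14, while putting all nodes into one community gives Q = 0, so the split is the better structural partition. Semantic information, such as node attributes, plays no role in this score.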
(6) The construction norms and standard system of the geoscience knowledge graph. Standardizing the geological knowledge graph is very important for improving construction efficiency, ensuring that data can be reused in multiple fields, and realizing the full analytical and technical value of the knowledge graph. This project studies the overall framework of the geological knowledge graph, focusing on knowledge acquisition, knowledge representation, knowledge modeling, knowledge fusion, knowledge storage, knowledge computing, knowledge operation and maintenance, natural language processing, and the integration of other supporting technologies. The framework covers the whole life cycle of the knowledge graph and provides a basis for technology development and application.

Conclusions
The analysis above leads to the following conclusions.
(1) A knowledge graph represents the objects of the objective world and their relationships with the mathematical model of a graph, which makes knowledge and data easier to exchange, circulate, and process between computers and between computers and people. Compared with a traditional relational database, a knowledge graph is more flexible and better suited to a big data environment. In the era of big data and artificial intelligence, there is an urgent need for a language that people and machines can both understand in order to extend the human brain.
(2) The construction of the knowledge graph of porphyry copper deposits is a useful experiment. It may be extended to the epithermal metallogenic system and to the Qinzhou Bay - Hangzhou Bay metallogenic belt, South China, resulting in a complete knowledge graph system from the single deposit, through the metallogenic series, to the important metallogenic area (belt). A larger knowledge graph system of the Earth system - Metallogenic system - Exploration system - Prediction and evaluation system may then be expected.
(3) Future mineral resource prediction and evaluation requires a full understanding of the relationships among the Earth system, the metallogenic system, the exploration system, and the prediction and evaluation system. A more universal metallogenic prediction system may be established through the open integration and deep mining of different systems and of geological big data.
Establishing the associated knowledge graph system of the Earth system - Metallogenic system - Exploration system - Prediction and evaluation system may promote the transformation of the quantitative prediction and evaluation of mineral resources.