Search results for 'labeling':

Articles found: 12

Editor’s note
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1533-1538

Editor’s note
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1533-1538
Berger A.I., Guda S.A.
Optimal threshold selection algorithms for multi-label classification: property study
Computer Research and Modeling, 2022, v. 14, no. 6, pp. 1221-1238

Multi-label classification models arise in many areas of modern life, owing to the ever-growing volume of information that requires prompt analysis. One mathematical approach to this problem is the plug-in method: in the first stage, a ranking function is built for each class that orders all objects in some way, and in the second stage, an optimal threshold value is selected for each class, so that objects on one side of it are assigned to the class and objects on the other side are not. The thresholds are chosen to maximize the target quality metric. The algorithms whose properties are studied in this article address the second stage of the plug-in approach, namely the choice of the optimal threshold vector. This stage becomes nontrivial when the target metric is the $F$-measure of average precision and recall, because it does not allow the threshold of each class to be optimized independently. In extreme multi-label classification the number of classes can reach hundreds of thousands, so the original optimization problem is reduced to finding a fixed point of a specially introduced map $\boldsymbol V$ defined on the unit square in the plane of average precision $P$ and recall $R$. Using this map, two optimization algorithms are proposed: the $F$-measure linearization method and the method of analyzing the domain of $\boldsymbol V$. The properties of the algorithms are studied on multi-label classification data sets of various sizes and origins, in particular the dependence of the error on the number of classes, on the $F$-measure parameter, and on the internal parameters of the methods. Both algorithms exhibit a peculiarity on problems where the domain of $\boldsymbol V$ has extended linear boundary segments: when the optimal point lies near these segments, the errors of both methods do not decrease as the number of classes grows. In this case, the linearization method determines the argument of the optimal point fairly accurately, while the domain analysis method determines its polar radius.

Keywords: multi-label classification, extreme classification, $F$-measure, linearization method, domain analysis method.

Berger A.I., Guda S.A.
Optimal threshold selection algorithms for multi-label classification: property study
Computer Research and Modeling, 2022, v. 14, no. 6, pp. 1221-1238

Multi-label classification models arise in various areas of life, which is explained by the increasing amount of information that requires prompt analysis. One mathematical method for solving this problem is the plug-in approach: at the first stage, a ranking function is built for each class, ordering all objects in some way, and at the second stage, an optimal threshold is selected for each class, so that objects on one side of it are assigned to the class and objects on the other side are not. Thresholds are chosen to maximize the target quality measure. The algorithms whose properties are investigated in this article address the second stage of the plug-in approach, the choice of the optimal threshold vector. This step becomes nontrivial if the $F$-measure of average precision and recall is used as the target quality measure, since it does not allow independent threshold optimization in each class. In extreme multi-label classification problems the number of classes can reach hundreds of thousands, so the original optimization problem is reduced to the problem of finding a fixed point of a specially introduced transformation $\boldsymbol V$, defined on the unit square in the plane of average precision $P$ and recall $R$. Using this transformation, two optimization algorithms are proposed: the $F$-measure linearization method and the method of $\boldsymbol V$ domain analysis. The properties of the algorithms are studied on multi-label classification data sets of various sizes and origins, in particular the dependence of the error on the number of classes, on the $F$-measure parameter, and on the internal parameters of the methods. A peculiarity of both algorithms was found for problems in which the domain of $\boldsymbol V$ has long linear boundary segments: when the optimal point is located in the vicinity of these segments, the errors of both methods do not decrease as the number of classes increases. In this case, the linearization method determines the argument of the optimal point quite accurately, while the method of $\boldsymbol V$ domain analysis determines the polar radius.

Keywords: multi-label classification, extreme classification, $F$-measure, linearization method, domain analysis method.
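The target metric described in this abstract can be illustrated with a minimal sketch. It only evaluates the $F$-measure of class-averaged precision and recall for a given threshold vector; it is not the linearization or domain analysis method from the article. The function names, the toy data, and the convention that an empty denominator yields 0.0 are assumptions made for illustration.

```python
def macro_precision_recall(scores, labels, thresholds):
    """Average per-class precision and recall for a given threshold vector.

    scores[i][k]  - ranking score of object i for class k
    labels[i][k]  - 1 if object i truly belongs to class k, else 0
    thresholds[k] - object i is assigned to class k when scores[i][k] >= thresholds[k]
    """
    precisions, recalls = [], []
    for k, threshold in enumerate(thresholds):
        tp = n_pred = n_true = 0
        for score_row, label_row in zip(scores, labels):
            predicted = score_row[k] >= threshold
            n_pred += int(predicted)
            n_true += int(label_row[k])
            if predicted and label_row[k]:
                tp += 1
        # Assumed convention: an empty denominator yields 0.0.
        precisions.append(tp / n_pred if n_pred else 0.0)
        recalls.append(tp / n_true if n_true else 0.0)
    n = len(thresholds)
    return sum(precisions) / n, sum(recalls) / n


def f_measure(p, r, beta=1.0):
    """F-measure of the averaged precision p and averaged recall r."""
    b2 = beta * beta
    denom = b2 * p + r
    return (1 + b2) * p * r / denom if denom else 0.0


# Toy data (hypothetical): 3 objects, 2 classes.
scores = [[0.9, 0.2], [0.8, 0.7], [0.1, 0.6]]
labels = [[1, 0], [0, 1], [0, 1]]
p, r = macro_precision_recall(scores, labels, thresholds=[0.5, 0.5])
print(p, r, f_measure(p, r))
```

Because the $F$-measure is applied after averaging over classes, changing one class threshold shifts both averages and therefore changes how much every other class contributes, which is why the thresholds cannot be tuned one class at a time.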
Adekotujo A.S., Enikuomehin T., Aribisala B., Mazzara M., Zubair A.F.
Computational treatment of natural language text for intent detection
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1539-1554

Intent detection plays a crucial role in task-oriented conversational systems. To understand the user’s goal, the system relies on its intent detector to classify the user’s utterance, which may be expressed in different forms of natural language, into intent classes. However, the efficacy of intent detection systems has been hindered by a lack of data and by the fact that the user’s intent text is typically characterized by short, general sentences and colloquial expressions. The process of algorithmically determining user intent from a given statement is known as intent detection. The goal of this study is to develop an intent detection model that will accurately classify and detect user intent. Similarity scores are computed for the three models used so that they can be compared. The proposed model uses Contextual Semantic Search (CSS) capabilities for semantic search, Latent Dirichlet Allocation (LDA) for topic modeling, the Bidirectional Encoder Representations from Transformers (BERT) semantic matching technique, and the combination of LDA and BERT for text classification and detection. The dataset is taken from the Broad Twitter Corpus (BTC) and comprises various metadata. To prepare the data for analysis, a pre-processing step was applied. A sample of 1432 instances was selected out of the 5000 available because manual annotation is required and can be time-consuming. To compare the performance of the model with the existing model, the similarity scores, precision, recall, F1 score, and accuracy were computed. The results revealed that LDA-BERT achieved an accuracy of 95.88% for intent detection, BERT 93.84%, and LDA 92.23%. This shows that LDA-BERT performs better than the other models. It is hoped that the novel model will aid in ensuring information security and social media intelligence. For future work, an unsupervised LDA-BERT without any labeled data can be studied with the model.

Keywords: hate speech, intent classification, Twitter posts, sentiment analysis, opinion mining, intent identification from Twitter posts.

Ahmad U., Ivanov V.
Automating high-quality concept banks: leveraging LLMs and multimodal evaluation metrics
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1555-1567

Interpretability of deep learning models has become a focus of research, especially in fields such as healthcare and finance. Concept bottleneck models have emerged as a promising approach to transparency and interpretability: they use a set of concepts familiar to the user as an intermediate representation before the prediction layer. However, manual concept annotation is hampered by the time and effort it requires. In our work, we explore the potential of large language models (LLMs) for building high-quality concept banks and propose a multimodal metric for assessing the quality of generated concepts. We studied three key questions: the ability of LLMs to generate concept banks comparable to existing knowledge bases such as ConceptNet; whether unimodal, text-based semantic similarity suffices for evaluating associations between concepts and labels; and how effective multimodal information is for quantifying concept generation quality compared with unimodal concept-label semantic similarity. Our results show that multimodal models outperform unimodal approaches in assessing similarity between concepts and labels. Moreover, the concepts we generated for the CIFAR-10 and CIFAR-100 datasets surpass those obtained from ConceptNet and from the baseline model, which demonstrates the ability of LLMs to generate high-quality concepts. The ability to generate and evaluate high-quality concepts automatically will let researchers work with new datasets without additional effort.

Keywords: interpretability, large language models, concept bottleneck neural networks, machine learning.

Ahmad U., Ivanov V.
Automating high-quality concept banks: leveraging LLMs and multimodal evaluation metrics
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1555-1567

Interpretability in recent deep learning models has become a focus of research, particularly in sensitive domains such as healthcare and finance. Concept bottleneck models have emerged as a promising approach for achieving transparency and interpretability by leveraging a set of human-understandable concepts as an intermediate representation before the prediction layer. However, manual concept annotation is discouraged due to the time and effort involved. Our work explores the potential of large language models (LLMs) for generating high-quality concept banks and proposes a multimodal evaluation metric to assess the quality of generated concepts. We investigate three key research questions: the ability of LLMs to generate concept banks comparable to existing knowledge bases like ConceptNet, the sufficiency of unimodal text-based semantic similarity for evaluating concept-class label associations, and the effectiveness of multimodal information in quantifying concept generation quality compared to unimodal concept-label semantic similarity. Our findings reveal that multimodal models outperform unimodal approaches in capturing concept-class label similarity. Furthermore, our generated concepts for the CIFAR-10 and CIFAR-100 datasets surpass those obtained from ConceptNet and the baseline comparison, demonstrating the standalone capability of LLMs in generating high-quality concepts. Being able to automatically generate and evaluate high-quality concepts will enable researchers to adapt quickly to a new dataset, with little to no effort, before feeding the concepts into concept bottleneck models.

Keywords: interpretability, large language models, concept bottleneck models, machine learning.
Polezhaev V.A.
Automated citation graph building from a corpora of scientific documents
Computer Research and Modeling, 2012, v. 4, no. 4, pp. 707-719

The problem of automatically building a citation graph from a collection of scientific documents reduces to solving a sequence of recognition problems. Solution methods, their adaptation, and their combination into a processing pipeline are considered, and results of computational experiments are given for some of the tasks.

Keywords: computer text analysis, citation graph, bibliography, metadata descriptions, matching, linkage, labeling, segmentation.

Polezhaev V.A.
Automated citation graph building from a corpora of scientific documents
Computer Research and Modeling, 2012, v. 4, no. 4, pp. 707-719

In this paper the problem of automated building of a citation graph from a collection of scientific documents is considered as a sequence of machine learning tasks. The overall data processing technology is described, which consists of six stages: preprocessing, metainformation extraction, bibliography list extraction, splitting bibliography lists into separate bibliography records, standardization of each bibliography record, and record linkage. The goal of this paper is to survey approaches and algorithms suitable for each stage, motivate the choice of the best combination of algorithms, and adapt some of them for processing multilingual bibliographies. For some of the tasks, new algorithms and heuristics are proposed and evaluated on a mixed corpus of English and Russian documents.

Keywords: text mining, machine learning, information extraction, citation graph, bibliography, matching, record linkage, labeling, segmentation, conditional random fields.
Views (last year): 5. Citations: 1 (RSCI).
Nebaba S.G., Markov N.G.
Convolutional neural networks of YOLO family for mobile computer vision systems
Computer Research and Modeling, 2024, v. 16, no. 3, pp. 615-631

The work analyzes known classes of convolutional neural network models and studies promising models selected from them for detecting flying objects in images. Object detection here means the detection, spatial localization, and classification of flying objects. The selected models are studied comprehensively in order to identify the most effective ones for building mobile real-time computer vision systems. It is shown that, given the stated requirements for mobile real-time computer vision systems and hence for the convolutional neural network models underlying them, models of the YOLO family are the most suitable for detecting flying objects in images, and five models from this family are the most promising: YOLOv4, YOLOv4-Tiny, YOLOv4-CSP, YOLOv7, and YOLOv7-Tiny. A dataset was developed for training, validation, and comprehensive study of these models. Each labeled image in the dataset contains one or more flying objects of four classes: “bird”, “aircraft-type unmanned aerial vehicle”, “helicopter-type unmanned aerial vehicle”, and “unknown object” (objects in the airspace that do not belong to the first three classes). The study showed that all the convolutional neural network models significantly exceed the specified threshold for object detection speed (model computation speed), but only the YOLOv4-CSP and YOLOv7 models satisfy, and only partially, the requirement on the accuracy of detection (classification) of flying objects. The “bird” class is the hardest to detect. YOLOv7 proved the most effective in classification accuracy, with YOLOv4-CSP in second place. Both models are recommended for use in a mobile real-time computer vision system, provided that the number of images with “bird” objects in the dataset is increased and the models are further trained so that they meet the detection accuracy requirement for each of the four classes.

Keywords: detection of flying objects in images, convolutional neural network, YOLO, mobile computer vision system.

Nebaba S.G., Markov N.G.
Convolutional neural networks of YOLO family for mobile computer vision systems
Computer Research and Modeling, 2024, v. 16, no. 3, pp. 615-631

The work analyzes known classes of convolutional neural network models and studies promising models selected from them for detecting flying objects in images. Object detection here refers to the detection, localization in space, and classification of flying objects. A comprehensive study of the selected models is conducted in order to identify the most effective ones for creating mobile real-time computer vision systems. It is shown that the most suitable models for detecting flying objects in images, taking into account the formulated requirements for mobile real-time computer vision systems, are models of the YOLO family, and five models from this family are the most promising: YOLOv4, YOLOv4-Tiny, YOLOv4-CSP, YOLOv7, and YOLOv7-Tiny. An appropriate dataset has been developed for training, validation, and comprehensive study of these models. Each labeled image in the dataset includes one or more flying objects of four classes: “bird”, “aircraft-type unmanned aerial vehicle”, “helicopter-type unmanned aerial vehicle”, and “unknown object” (objects in airspace not included in the first three classes). The study showed that all the convolutional neural network models exceed the specified threshold for the speed of detecting objects in an image; however, only the YOLOv4-CSP and YOLOv7 models satisfy the accuracy requirement for detecting flying objects, and only partially. The most difficult object class to detect is the “bird” class. The most effective model is YOLOv7, with YOLOv4-CSP in second place. Both models are recommended for use in a mobile real-time computer vision system, on condition that they are additionally trained on a larger number of images with objects of the “bird” class so that they satisfy the accuracy requirement for detecting flying objects of each of the four classes.

Keywords: detection of flying objects in images, convolutional neural network, YOLO, mobile computer vision system.
Stepkin A.V., Stepkina A.S.
Algorithm of simple graph exploration by a collective of agents
Computer Research and Modeling, 2021, v. 13, no. 1, pp. 33-45

The study addresses the problem of exploring finite graphs with a collective of agents. Finite undirected graphs without loops or multiple edges are considered. The collective consists of two agent-researchers, each with a finite memory that does not depend on the number of nodes of the explored graph and each using two colors (three distinct colors in total, since one color is shared by both agents), and one agent-experimenter, which has a finite but unboundedly growing internal memory. The agent-researchers can move along the graph simultaneously, read and change the labels of graph elements, and pass the necessary information to the third agent, the agent-experimenter. The agent-experimenter is a stationary agent whose memory records the result of the agent-researchers’ work at each step and in which a representation of the explored graph (initially unknown to the agents) is gradually built as a list of edges and a list of nodes.

The paper describes in detail the operating modes of the agent-researchers and the priority of their activation, and considers the commands that the agent-researchers exchange with the agent-experimenter while executing particular procedures. Problematic situations that arise in the work of the agent-researchers are also examined in detail, for example coloring a white node when both agents arrive at the same node simultaneously, or marking and recognizing isthmus edges (edges that connect subgraphs explored by different agent-researchers). The full algorithm of the agent-experimenter is presented, with a detailed description of the procedures for processing the messages received from the agent-researchers, on the basis of which the representation of the explored graph is built. A complete analysis of the time, space, and communication complexity of the algorithm is also given.

The presented graph exploration algorithm has quadratic time complexity (in the number of nodes of the explored graph), quadratic space complexity, and quadratic communication complexity. The algorithm is based on depth-first graph traversal.

Keywords: collective of agents, graph exploration, depth-first method.

Stepkin A.V., Stepkina A.S.
Algorithm of simple graph exploration by a collective of agents
Computer Research and Modeling, 2021, v. 13, no. 1, pp. 33-45

The study presented in the paper is devoted to the problem of finite graph exploration using a collective of agents. Finite undirected graphs without loops and multiple edges are considered. The collective consists of two agent-researchers, which have a finite memory independent of the number of nodes of the graph they study and use two colors each (three colors in total), and one agent-experimenter, which has a finite but unboundedly growing internal memory. The agent-researchers can simultaneously traverse the graph, read and change labels of graph elements, and transmit the necessary information to the third agent, the agent-experimenter. The agent-experimenter is a non-moving agent in whose memory the result of the work of the agent-researchers at each step is recorded and a representation of the investigated graph (initially unknown to the agents) is gradually built up as a list of edges and a list of nodes.

The work includes a detailed description of the operating modes of the agent-researchers, with an indication of the priority of their activation. The commands exchanged between the agent-researchers and the agent-experimenter during the execution of procedures are considered. Problematic situations arising in the work of the agent-researchers are also studied in detail, for example coloring a white node when two agents simultaneously arrive at the same node, or marking and examining isthmuses (edges connecting subgraphs examined by different agent-researchers). The full algorithm of the agent-experimenter is presented with a detailed description of the processing of messages received from the agent-researchers, on the basis of which a representation of the studied graph is built. In addition, a complete analysis of the time, space, and communication complexities of the constructed algorithm is performed.

The presented graph exploration algorithm has a quadratic (with respect to the number of nodes of the studied graph) time complexity, quadratic space complexity, and quadratic communication complexity. The graph exploration algorithm is based on the depth-first traversal method.

Keywords: collective of agents, exploration of graphs, graph traversal.
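The idea of building a node list and an edge list during a depth-first traversal can be sketched as follows. This is a deliberately simplified single-traverser illustration, not the authors' three-agent algorithm: it has no coloring, no message exchange, and unlimited memory. The function name `explore_graph` and the adjacency-dictionary input format are assumptions.

```python
def explore_graph(adjacency, start):
    """Depth-first exploration of a simple undirected graph.

    adjacency maps each node to an iterable of its neighbors; node labels
    are assumed to be mutually comparable (for example, integers).
    Returns (nodes, edges): nodes in discovery order and each undirected
    edge once, as a (smaller, larger) pair in order of first encounter.
    """
    nodes, edges = [], []
    seen_nodes, seen_edges = set(), set()

    def visit(u):
        seen_nodes.add(u)
        nodes.append(u)
        for v in adjacency[u]:
            edge = (min(u, v), max(u, v))  # normalize the undirected edge
            if edge not in seen_edges:
                seen_edges.add(edge)
                edges.append(edge)
            if v not in seen_nodes:
                visit(v)  # go deeper before trying the next neighbor

    visit(start)
    return nodes, edges


# Hypothetical example graph: a triangle 1-2-3 with a pendant node 4.
graph = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
nodes, edges = explore_graph(graph, start=1)
print(nodes)
print(edges)
```

The recursive form keeps the sketch short; for large graphs an explicit stack would avoid Python's recursion limit.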
Kochergin A.V., Kholmatova Z.Sh.
Extraction of characters and events from narratives
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1593-1600

Extracting events and characters from narratives is a fundamental task in natural language text analysis and processing. Event extraction methods are applied in a wide range of areas, from summarizing various documents to analyzing medical records. We defined events using a framework called “four W” (who, what, when, where) to cover all the main components of events, such as actors, actions, times, and places. In this article we considered two main event extraction methods: statistical parsing of syntactic trees and semantic role labeling. Although different researchers have studied these methods separately, we directly compared the performance of the two approaches on a dataset that we collected and annotated.

Our analysis showed that statistical parsing of syntactic trees outperforms semantic role labeling in extracting events and characters, especially in identifying specific details. Nevertheless, semantic role labeling performed well at correctly identifying actors. We evaluated both approaches by comparing metrics such as precision, recall, and F1-score, thereby showing their respective advantages and limitations.

Moreover, we proposed several applications of event extraction methods that we plan to study in the future. The areas in which we want to apply these methods include code analysis and source code authorship attribution. We are considering using event extraction to identify key code elements such as variable assignments and function calls, which may later help researchers analyze program behavior and identify project contributors. Our work offers new insight into the performance of statistical parsing and semantic role labeling and suggests new directions for applying these methods.

Keywords: event extraction, natural language processing, statistical parsing, semantic role labeling.

Kochergin A.V., Kholmatova Z.Sh.
Extraction of characters and events from narratives
Computer Research and Modeling, 2024, v. 16, no. 7, pp. 1593-1600

Event and character extraction from narratives is a fundamental task in text analysis. The application of event extraction techniques ranges from the summarization of different documents to the analysis of medical notes. We identify events based on a framework named “four W” (Who, What, When, Where) to capture all the essential components, such as the actors, actions, times, and places. In this paper, we explore two prominent techniques for event extraction: statistical parsing of syntactic trees and semantic role labeling. While these techniques were investigated by different researchers in isolation, we directly compare the performance of the two approaches on our custom dataset, which we have annotated.

Our analysis shows that statistical parsing of syntactic trees outperforms semantic role labeling in event and character extraction, especially in identifying specific details. Nevertheless, semantic role labeling demonstrates good performance in correct actor identification. We evaluate the effectiveness of both approaches by comparing metrics such as precision, recall, and F1-score, thus demonstrating their respective advantages and limitations.

Moreover, as a part of our work, we propose different future applications of event extraction techniques that we plan to investigate. The areas where we want to apply these techniques include code analysis and source code authorship attribution. We consider using event extraction to retrieve key code elements such as variable assignments and function calls, which can further help us to analyze the behavior of programs and identify the project’s contributors. Our work provides new insight into the performance and efficiency of statistical parsing and semantic role labeling techniques, offering researchers new directions for the application of these techniques.

Keywords: event extraction, natural language processing, statistical parsing, semantic role labeling.
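The “four W” representation and the evaluation metrics named in this abstract can be sketched as below. The `Event` class, its field names, the exact-match scoring rule, and the toy events are all assumptions for illustration; the article's actual annotation scheme and matching criteria are not given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One event in the 'four W' form; empty strings mark missing slots."""
    who: str
    what: str
    when: str = ""
    where: str = ""


def precision_recall_f1(predicted, gold):
    """Score extracted events against annotated ones.

    Assumed rule: an extracted event counts as correct only when all four
    slots match an annotated event exactly.
    """
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical annotations and extractor output.
gold = [
    Event("Alice", "met Bob", "Monday", "Paris"),
    Event("Bob", "left", "Tuesday", "Paris"),
]
predicted = [
    Event("Alice", "met Bob", "Monday", "Paris"),
    Event("Bob", "arrived", "Tuesday", "Paris"),
    Event("Carol", "called"),
]
print(precision_recall_f1(predicted, gold))
```

A frozen dataclass is hashable, which lets the events be compared as sets; a slot-by-slot partial-credit score would be a natural alternative when exact matching is too strict.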
Матвеев А.В.
Математическое моделирование кинетики и расчет дозиметрических характеристик остеотропных радиофармацевтических лекарственных препаратов
Компьютерные исследования и моделирование, 2022, т. 14, № 3, с. 647-660

В отечественной медицине для радионуклидной терапии костных метастазов сегодня применяются два радиофармпрепарата: ⁸⁹Sr-хлорид и ¹⁵³Sm-оксабифор. Первый изних имеет много побочных эффектов, поэтому его применение ограничено. Второй доступен только в клиниках, транспортировка его в которые не занимает много времени. В настоящее время клинические исследования проходит третий радиофармпрепарат — ¹⁸⁸Re-золерен. В связи с генераторным способом получения ¹⁸⁸Re данный радиофармпрепарат должен стать доступным для применения во многих регионах нашей страны. Поэтому возникает необходимость в сравнительном анализе характеристик этих радиофармпрепаратов, в том числе на основе математического моделирования.

В статье рассмотрены особенности математического моделирования кинетики остеотропных радиофармацевтических лекарственных препаратов в организме человека с костными метастазами. На основе четырехкамерной модели разработан и апробирован комплекс моделирования и расчета фармакокинетических и дозиметрических характеристик радиофармпрепаратов для радионуклидной терапии костных метастазов. С использованием клинических данных идентифицированы транспортные константы модели и рассчитаны индивидуальные характеристики отечественных радиофармпрепаратов, меченных ⁸⁹Sr, ¹⁵³Sm и ¹⁸⁸Re (эффективные периоды полувыведения, максимальные активности в камерах и времена их достижения, поглощенные дозы на костные ткани и метастазы, эндостальный слой кости, красный костный мозг, кровь, почки и мочевой пузырь). Получены и проанализированы зависимости «активность–время» для всех камер модели. Проведен сравнительный анализфар макокинетики и дозиметрии трех радиофармпрепаратов (⁸⁹Sr-хлорид, ¹⁵³Sm-оксабифор, ¹⁸⁸Re-золерен).

Из сравнительного анализа фармакокинетических и дозиметрических характеристик этих радиофармацевтических лекарственных препаратов следует, что наилучшим из них для широкого применения во многих регионах нашей страны должен стать ¹⁸⁸Re-золерен с учетом генераторного способа получения ¹⁸⁸Re в условиях стационара.

Ключевые слова: математическое моделирование, ядерная медицина, дозиметрия, кинетика, радиофармпрепарат, камерная модель.

Matveev A.V.
Mathematical modeling the kinetics and calculation of dosimetric characteristics of osteotropic radiopharmaceutical drugs
Computer Research and Modeling, 2022, v. 14, no. 3, pp. 647-660

In Russian medicine, two radiopharmaceuticals are currently used for radionuclide therapy of bone metastases: ⁸⁹Sr-chloride and ¹⁵³Sm-oxabifor. The first has many side effects, so its use is limited. The second is available only in clinics to which it can be delivered quickly. A third radiopharmaceutical, ¹⁸⁸Re-solerene, is currently undergoing clinical trials. Because ¹⁸⁸Re is obtained from a generator, this radiopharmaceutical should become available for use in many regions of our country. This creates a need for a comparative analysis of the characteristics of these radiopharmaceuticals, including one based on mathematical modeling.

The article discusses the features of mathematical modeling of the kinetics of osteotropic radiopharmaceutical drugs in the body of a patient with bone metastases. Based on a four-compartment model, a software complex for modeling and calculating the pharmacokinetic and dosimetric characteristics of radiopharmaceuticals for radionuclide therapy of bone metastases was developed and tested. Using clinical data, the transport constants of the model were identified and the individual characteristics of Russian radiopharmaceuticals labeled with ⁸⁹Sr, ¹⁵³Sm and ¹⁸⁸Re were calculated (effective half-lives, maximum activities in the compartments and the times at which they are reached, absorbed doses to bone tissue and metastases, the endosteal bone layer, red bone marrow, blood, kidneys and bladder). Time-activity curves were obtained and analyzed for all compartments of the model. A comparative analysis of the pharmacokinetics and dosimetry of the three radiopharmaceuticals (⁸⁹Sr-chloride, ¹⁵³Sm-oxabifor, ¹⁸⁸Re-solerene) was carried out.
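
A compartment model of this kind can be illustrated with a minimal sketch. The abstract does not name the four compartments or give the transport constants, so everything below is an assumption made for illustration only: the compartments (blood, bone, kidneys, bladder), the first-order transfer scheme, the numerical values in `ILLUSTRATIVE_K`, and the absence of bladder voiding. The only physical input is the half-life of ¹⁸⁸Re, taken as approximately 17 h.

```python
import math

# Hypothetical first-order transfer constants (1/h).
# Illustrative values only; they are NOT taken from the article.
ILLUSTRATIVE_K = {
    "blood_bone": 0.30,
    "bone_blood": 0.01,
    "blood_kidneys": 0.20,
    "kidneys_bladder": 0.50,
}


def simulate(k, half_life_h, t_end_h=24.0, dt_h=0.01):
    """Integrate an assumed linear four-compartment model with RK4.

    State order: (blood, bone, kidneys, bladder), as fractions of the
    injected activity. Returns a list of (time_h, state) pairs.
    """
    lam = math.log(2) / half_life_h  # physical decay constant, 1/h

    def rhs(a):
        blood, bone, kidneys, bladder = a
        return (
            -(k["blood_bone"] + k["blood_kidneys"] + lam) * blood
            + k["bone_blood"] * bone,
            k["blood_bone"] * blood - (k["bone_blood"] + lam) * bone,
            k["blood_kidneys"] * blood - (k["kidneys_bladder"] + lam) * kidneys,
            k["kidneys_bladder"] * kidneys - lam * bladder,  # no voiding assumed
        )

    def shifted(a, d, h):
        # Helper: a + h * d, element-wise.
        return tuple(x + h * dx for x, dx in zip(a, d))

    def rk4_step(a):
        k1 = rhs(a)
        k2 = rhs(shifted(a, k1, 0.5 * dt_h))
        k3 = rhs(shifted(a, k2, 0.5 * dt_h))
        k4 = rhs(shifted(a, k3, dt_h))
        return tuple(
            x + dt_h / 6.0 * (d1 + 2 * d2 + 2 * d3 + d4)
            for x, d1, d2, d3, d4 in zip(a, k1, k2, k3, k4)
        )

    state = (1.0, 0.0, 0.0, 0.0)  # all activity starts in blood
    trajectory = [(0.0, state)]
    for i in range(1, round(t_end_h / dt_h) + 1):
        state = rk4_step(state)
        trajectory.append((i * dt_h, state))
    return trajectory


def peak(trajectory, index):
    """Return (time_h, activity) at the maximum of one compartment."""
    t, state = max(trajectory, key=lambda point: point[1][index])
    return t, state[index]
```

Calling `peak(simulate(ILLUSTRATIVE_K, 17.0), 1)` gives the maximum activity in the assumed bone compartment and the time at which it is reached, which are the kind of quantities the abstract lists. Because the sketch has no excretion out of the system, the total activity over all compartments decays purely with the physical half-life, which serves as a simple sanity check.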

It follows from the comparative analysis of the pharmacokinetic and dosimetric characteristics of these radiopharmaceutical drugs that ¹⁸⁸Re-solerene should be the best of them for widespread use in many regions of our country, given that ¹⁸⁸Re can be obtained from a generator in a hospital.

Keywords: mathematical modeling, nuclear medicine, dosimetry, kinetics, radiopharmaceutical, compartment model.
Краснов Ф.В., Смазневич И.С., Баскакова Е.Н.
Метод контрастного семплирования для предсказания библиографических ссылок
Компьютерные исследования и моделирование, 2021, т. 13, № 6, с. 1317-1336

В работе рассматривается задача поиска в научной статье фрагментов с недостающими библиографическими ссылками с помощью автоматической бинарной классификации. Для обучения модели предложен метод контрастного семплирования, новшеством которого является рассмотрение контекста ссылки с учетом границ фрагмента, максимально влияющего на вероятность нахождения в нем библиографической ссылки. Обучающая выборка формировалась из автоматически размеченных семплов — фрагментов из трех предложений с метками классов «без ссылки» и «со ссылкой», удовлетворяющих требованию контрастности: семплы разных классов дистанцируются в исходном тексте. Пространство признаков строилось автоматически по статистике встречаемости термов и расширялось за счет конструирования дополнительных признаков — выделенных в тексте сущностей ФИО, чисел, цитат и аббревиатур.

Проведена серия экспериментов на архивах научных журналов «Правоприменение» (273 статьи) и «Журнал инфектологии» (684 статьи). Классификация осуществлялась моделями Nearest Neighbours, RBF SVM, Random Forest, Multilayer Perceptron, с подбором оптимальных гиперпараметров для каждого классификатора.

Эксперименты подтвердили выдвинутую гипотезу. Наиболее высокую точность показал нейросетевой классификатор (95%), уступающий по скорости линейному, точность которого при контрастном семплировании также оказалась высока (91–94 %). Полученные значения превосходят результаты, опубликованные для задач NER и анализа тональности на данных со сравнимыми характеристиками. Высокая вычислительная эффективность предложенного метода позволяет встраивать его в прикладные системы и обрабатывать документы в онлайн-режиме.

Ключевые слова: контрастное семплирование, анализ цитирования, передискретизация данных, предсказание библиографических ссылок, текстовая классификация, искусственные нейронные сети.

Krasnov F.V., Smaznevich I.S., Baskakova E.N.
Bibliographic link prediction using contrast resampling technique
Computer Research and Modeling, 2021, v. 13, no. 6, pp. 1317-1336

The paper studies the problem of finding fragments of a scientific article that lack bibliographic links, using automatic binary classification. To train the model, we propose a contrast resampling technique whose novelty is that it considers the context of a link within the boundaries of the fragment that most strongly affects the probability that a bibliographic link is present. The training set was formed from automatically labeled samples: three-sentence fragments with the class labels «without link» and «with link» that satisfy a contrast requirement, namely that samples of different classes are separated from each other in the source text. The feature space was built automatically from term occurrence statistics and was extended with additional constructed features: entities recognized in the text (names, numbers, quotes and abbreviations).
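
The sampling idea can be sketched roughly as follows. This is not the authors' implementation: the bracketed citation pattern, the `min_gap` distance parameter, the choice to center positive fragments on the citing sentence, and the removal of citation markers from the fragment text are all assumptions introduced for illustration.

```python
import re

# Assumed citation marker style such as "[3]" or "[1, 2]" (hypothetical).
CITATION = re.compile(r"\s*\[\d+(?:\s*[,–-]\s*\d+)*\]")


def contrast_samples(sentences, window=3, min_gap=2):
    """Build labeled fragments of `window` consecutive sentences.

    A fragment is labeled "with link" when its central sentence carries a
    citation marker. It is labeled "without link" only when every citing
    sentence lies more than `min_gap` sentences outside the fragment.
    Fragments that fall in between are skipped, which keeps the two
    classes apart in the source text.
    """
    cited = [i for i, s in enumerate(sentences) if CITATION.search(s)]
    cited_set = set(cited)
    half = window // 2
    samples = []
    for center in range(half, len(sentences) - half):
        lo, hi = center - half, center + half
        # Strip markers so a classifier cannot simply learn the brackets.
        text = CITATION.sub("", " ".join(sentences[lo:hi + 1]))
        if center in cited_set:
            samples.append((text, "with link"))
        elif all(c < lo - min_gap or c > hi + min_gap for c in cited):
            samples.append((text, "without link"))
    return samples
```

For example, with the sentences `["A.", "B [1].", "C.", "D.", "E.", "F.", "G.", "H."]` and `min_gap=1`, the fragment around the citing sentence becomes a «with link» sample, the two fragments that overlap or sit too close to it are dropped, and only the more distant fragments are labeled «without link».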

A series of experiments was carried out on the archives of the scientific journals «Law enforcement review» (273 articles) and «Journal Infectology» (684 articles). The classification was carried out by the models Nearest Neighbors, RBF SVM, Random Forest, Multilayer Perceptron, with the selection of optimal hyperparameters for each classifier.

The experiments confirmed the proposed hypothesis. The neural network classifier reached the highest accuracy (95%) but was slower than the linear classifier, which also showed high accuracy with contrast resampling (91–94%). These values exceed those reported for NER and sentiment analysis on data with comparable characteristics. The high computational efficiency of the proposed method makes it possible to integrate it into applied systems and to process documents online.

Keywords: contrast resampling, citation analysis, data resampling, link prediction, text classification, artificial neural network.

Журнал индексируется в Scopus

Полнотекстовая версия журнала доступна также на сайте научной электронной библиотеки eLIBRARY.RU

Журнал входит в систему Российского индекса научного цитирования.

Журнал включен в базу данных Russian Science Citation Index (RSCI) на платформе Web of Science
