Parsing and Layout Insights of Agentic Document and Information Service Technology Based on IARPA's Project Announcements
Fu Yun¹, Liu Xiwen¹,²
1. National Science Library, Chinese Academy of Sciences, Beijing 100190
2. Department of Information Resources Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
Abstract Clarifying the connotation and scope of Agentic Document and Information Service Technology (ADIST) is crucial for the strategic planning of Document and Information Service (DIS) institutions and for the research directions of scholars. This study conducts a fine-grained technical content analysis of 87 project announcements published by IARPA to systematically reveal the structure and characteristics of ADIST across four hierarchical dimensions: topic, subtopic, technical issue, and evaluation metric. On this basis, it identifies what ADIST is and what its key components are. To this end, a technology layout analysis framework based on project announcements is proposed. A Project Description Model (PDM) is designed that incorporates three categories of knowledge elements: research objectives, technical questions, and evaluation metrics. A text-parsing prompt for project announcements, tailored to the PDM, is developed; applied with GPT-4o (Generative Pretrained Transformer 4 Omni), it achieves an average recognition accuracy of 92.94% across the three knowledge elements. In addition, TopicGPT is used to construct hierarchical topics (topic-subtopic) from the summary and research objective texts of the project announcements. The content and characteristics of the technical layout are then revealed by integrating the hierarchical topics, technical issues, and evaluation metrics. The analysis concludes that ADIST comprises intelligent technologies applied to DIS tasks together with information technologies that advance the agentic transformation of the DIS workflow. It encompasses four key aspects (intelligent data, intelligent computing, intelligent cognition, and intelligent systems), with four core characteristics and four evaluation principles. Case studies further support the reliability of these findings. Finally, guided by a broad DIS perspective, this study proposes six key future-oriented tasks for ADIST development. Three are technical research challenges: intelligent analysis and cognitive modeling for DIS application scenarios; goal-driven production and organization of multimodal DIS data; and intelligent information technologies for computation and analysis in complex DIS scenarios. The other three are practical application challenges: standardization and automation of DIS workflows; usability and dissemination of DIS tools and analytical results; and determinacy and measurability of DIS evaluation frameworks.
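As a rough illustration of the Project Description Model described in the abstract, the sketch below represents one parsed announcement as a record holding the three knowledge elements and averages a per-element accuracy against a hand-labeled reference. The class name, field names, and the accuracy definition (share of reference items recovered) are assumptions made for this example; the abstract does not state how the reported accuracy was computed, and the toy data are not from the study.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ProjectDescription:
    """Hypothetical container for the three PDM knowledge elements."""
    title: str
    research_objectives: list[str] = field(default_factory=list)
    technical_questions: list[str] = field(default_factory=list)
    evaluation_metrics: list[str] = field(default_factory=list)


def element_accuracy(predicted: list[str], gold: list[str]) -> float:
    """Share of gold items that also appear in the predicted list (assumed metric)."""
    if not gold:
        # Nothing to find: correct only if nothing was predicted.
        return 1.0 if not predicted else 0.0
    hits = sum(1 for item in gold if item in predicted)
    return hits / len(gold)


def average_accuracy(pred: ProjectDescription, gold: ProjectDescription) -> float:
    """Unweighted mean of the accuracies over the three knowledge elements."""
    return mean([
        element_accuracy(pred.research_objectives, gold.research_objectives),
        element_accuracy(pred.technical_questions, gold.technical_questions),
        element_accuracy(pred.evaluation_metrics, gold.evaluation_metrics),
    ])


# Toy data, invented purely for illustration.
gold = ProjectDescription(
    title="Example announcement",
    research_objectives=["objective A", "objective B"],
    technical_questions=["question A"],
    evaluation_metrics=["metric A"],
)
pred = ProjectDescription(
    title="Example announcement",
    research_objectives=["objective A"],
    technical_questions=["question A"],
    evaluation_metrics=["metric B"],
)
score = average_accuracy(pred, gold)
```

In this toy case the per-element accuracies are 0.5, 1.0, and 0.0. A real evaluation would likely need fuzzy or human-judged matching rather than exact string equality.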
Received: 14 September 2024