
BrowseCrawler: offline crawling of a whole tree

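Overview

When an IDE or HMI goes online, it often needs to pull an AddressSpace subtree to the local machine so that later node searches no longer issue Browse RPCs. BrowseCrawler traverses the subtree with a BFS queue and applies two guards (max_depth and max_nodes) to keep the crawl from growing without bound.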
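The traversal described above can be sketched without a server. This is a minimal illustration, not BrowseCrawler's actual implementation: `bfs_crawl` and `children_of` are hypothetical names, a plain dictionary stands in for the Browse call, and the exact way the library counts depth may differ.

```python
from collections import deque

def bfs_crawl(children_of, root, max_depth=8, max_nodes=50_000):
    """Breadth-first walk with a depth cap and a total-node cap.

    children_of is a hypothetical callable standing in for one Browse
    call: it maps a node id to a list of child node ids.
    """
    visited = {root}
    order = []                      # nodes in the order they were reached
    queue = deque([(root, 0)])      # (node id, depth from root)
    while queue:
        node, depth = queue.popleft()
        order.append(node)
        if len(order) >= max_nodes:
            break                   # guard 2: total node budget exhausted
        if depth >= max_depth:
            continue                # guard 1: do not expand below max_depth
        for child in children_of(node):
            if child not in visited:    # references can form cycles
                visited.add(child)
                queue.append((child, depth + 1))
    return order


# Toy address space: a dict used in place of a live server.
tree = {"Objects": ["Boilers", "Pumps"], "Boilers": ["B1", "B2"], "B1": ["Temp"]}
lookup = lambda nid: tree.get(nid, [])

print(bfs_crawl(lookup, "Objects"))
# ['Objects', 'Boilers', 'Pumps', 'B1', 'B2', 'Temp']
print(bfs_crawl(lookup, "Objects", max_depth=1))
# ['Objects', 'Boilers', 'Pumps']
print(bfs_crawl(lookup, "Objects", max_nodes=2))
# ['Objects', 'Boilers']
```

The two caps are independent: the depth cap bounds how far a narrow, deep branch is followed, while the node cap bounds a wide, shallow tree.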

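API

| Member | Kind | Read/Write | Description |
| --- | --- | --- | --- |
| BrowseCrawler(ua, max_depth, max_nodes, node_class_filter) | Constructor | | Creates a crawler |
| crawl_async(root_node_id, progress, cancel_event) | Method | | Asynchronous BFS; returns a CrawlResult |
| CrawlResult.all_nodes | Property | | Flat list of nodes |
| CrawlResult.children_by_parent | Property | | Parent-to-children dictionary |
| CrawlResult.elapsed | Property | | Total elapsed time (timedelta) |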
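To illustrate how the parent-to-children dictionary supports local lookups, here is a small sketch that resolves a browse path without any network call. The `Node` and `FakeCrawlResult` classes and the `resolve_path` helper are hypothetical stand-ins; in particular, it is an assumption that `children_by_parent` maps a parent node id to a list of child nodes, since the table does not specify the key and value types.

```python
from dataclasses import dataclass
from datetime import timedelta

@dataclass
class Node:                     # hypothetical stand-in for a crawled node
    node_id: str
    browse_name: str

@dataclass
class FakeCrawlResult:          # mirrors only the three documented properties
    all_nodes: list
    children_by_parent: dict    # assumed shape: parent node_id -> list of Node
    elapsed: timedelta = timedelta(0)

def resolve_path(result, root_id, names):
    """Follow browse names down from root_id using only the local dictionary."""
    current = root_id
    for name in names:
        kids = result.children_by_parent.get(current, [])
        match = next((k for k in kids if k.browse_name == name), None)
        if match is None:
            return None         # path does not exist in the crawled subtree
        current = match.node_id
    return current


boilers = Node("ns=2;s=Boilers", "Boilers")
b1 = Node("ns=2;s=Boilers.B1", "B1")
temp = Node("ns=2;s=Boilers.B1.Temp", "Temp")
result = FakeCrawlResult(
    all_nodes=[boilers, b1, temp],
    children_by_parent={"i=85": [boilers], boilers.node_id: [b1], b1.node_id: [temp]},
)

print(resolve_path(result, "i=85", ["Boilers", "B1", "Temp"]))  # ns=2;s=Boilers.B1.Temp
print(resolve_path(result, "i=85", ["Boilers", "B9"]))          # None
```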

Code example

import asyncio
from opcua import DarraOpcUa, BrowseCrawler, NodeClass, WellKnownNodes

async def main():
    with DarraOpcUa("opc.tcp://localhost:4840") as ua:
        ua.connect()

        # 1) Crawl the whole ObjectsFolder
        crawler = BrowseCrawler(ua, max_depth=8, max_nodes=50_000)

        def on_progress(count, current_nid):
            print(f"\rFetched {count} nodes, current {current_nid}", end="")

        result = await crawler.crawl_async(
            WellKnownNodes.OBJECTS_FOLDER, progress=on_progress)

        print(f"\n{len(result.all_nodes)} nodes in total, took {result.elapsed.total_seconds():.1f}s")

        # 2) Variable nodes only
        c2 = BrowseCrawler(ua, node_class_filter=NodeClass.VARIABLE)
        r2 = await c2.crawl_async("ns=2;s=Boilers")
        for n in r2.all_nodes:
            print(f" {n.browse_name} ({n.node_id})")

        # 3) Cancellation support
        cancel = asyncio.Event()
        async def cancel_after(sec):
            await asyncio.sleep(sec)
            cancel.set()
        # Keep a reference so the background task is not garbage-collected early
        cancel_task = asyncio.create_task(cancel_after(5))
        r3 = await crawler.crawl_async("i=85", cancel_event=cancel)
        print(f"Fetched {len(r3.all_nodes)} nodes before cancellation")

asyncio.run(main())

Performance

| Node count | Elapsed time (gigabit LAN) |
| --- | --- |
| 1,000 | ~0.5 s |
| 10,000 | ~5 s |
| 50,000 | ~25 s |
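All three rows above work out to roughly 2,000 nodes per second, so crawl time appears to scale linearly with node count on a gigabit LAN. The sketch below turns that into a rough estimator; `estimate_seconds` is a hypothetical helper, and extrapolating to other node counts, servers, or networks is an assumption rather than a measured result.

```python
# Figures copied from the table above: (node count, approximate seconds).
samples = [(1_000, 0.5), (10_000, 5.0), (50_000, 25.0)]

rates = [n / s for n, s in samples]     # nodes per second for each row
rate = sum(rates) / len(rates)          # average throughput

def estimate_seconds(node_count, nodes_per_second=rate):
    """Linear estimate of crawl time; only as good as the samples behind it."""
    return node_count / nodes_per_second

print(rate)                      # 2000.0
print(estimate_seconds(20_000))  # 10.0
```

An estimate like this can help pick a max_nodes value that keeps startup time acceptable.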

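Best practices

  • Restrict the root; do not start from i=84 (Root)
  • Use node_class_filter to save memory
  • Crawl once at startup, then serve later lookups from the local copy
  • Give users a "Cancel" button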
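The "crawl once, look up locally" practice can be sketched as a small in-memory index built from the flat node list. `LocalNodeIndex` and the `Node` tuple are hypothetical illustrations, not part of the library; the only assumption about crawled nodes is that they expose `node_id` and `browse_name`, as used in the code example.

```python
from collections import namedtuple

# Hypothetical stand-in for the items in CrawlResult.all_nodes.
Node = namedtuple("Node", "node_id browse_name")

class LocalNodeIndex:
    """In-memory index built once; later searches issue no Browse RPC."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._by_id = {n.node_id: n for n in self._nodes}

    def get(self, node_id):
        # Exact lookup by node id; returns None when the id was not crawled.
        return self._by_id.get(node_id)

    def search(self, text):
        # Case-insensitive substring match on the browse name.
        needle = text.casefold()
        return [n for n in self._nodes if needle in n.browse_name.casefold()]


index = LocalNodeIndex([
    Node("ns=2;s=Boilers.B1.Temp", "Temperature"),
    Node("ns=2;s=Boilers.B1.Press", "Pressure"),
    Node("ns=2;s=Pumps.P1.Temp", "Temperature"),
])

print([n.node_id for n in index.search("temp")])
# ['ns=2;s=Boilers.B1.Temp', 'ns=2;s=Pumps.P1.Temp']
print(index.get("ns=2;s=Boilers.B1.Press").browse_name)
# Pressure
```

In an application the index would be built from the all_nodes list of a real crawl at startup and kept for the lifetime of the session.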

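Cross-language reference

| C# | Python | Java | C++ | Rust | C |
| --- | --- | --- | --- | --- | --- |
| BrowseCrawler | BrowseCrawler | BrowseCrawler | BrowseCrawler | BrowseCrawler | DarraUa_BrowseCrawler_* |
| CrawlAsync | crawl_async | crawlAsync | CrawlAsync | crawl_async | DarraUa_BrowseCrawler_Crawl |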

Next steps