Traversing a category tree

冯润阳 2022-7-27 561

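I need to scrape a category tree that expands over several levels; see https://www.amazon.com/gp/bestsellers

Clicking any root category on the left reveals the subtree beneath it.

I would like to implement this with recursion: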

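While using the UiElement.DataScrap command, I noticed the following pattern:

In the "selecors" part, one entry is repeated once more for every level you descend. Put simply:

arrayData=........"selecors":[A, {"tag".........

arrayData2=........"selecors":[A,A, {"tag".........

arrayData3=........"selecors":[A,A,A, {"tag".........

Note the positions marked A. Here A stands for a string, specifically {"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"} (in the generated code further down, the first A in each array has "prefix":"" instead of "prefix":">").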


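So, question 1:

Given that the selector list grows with each level, how should I write the recursion?

Below is the data-scraping part of my source code. Each time a level of the tree is scraped, I iterate over its entries, and any entry that has a lower level gets traversed in turn.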


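Question 2: someone suggested that I use child elements and parent elements. I don't fully understand this.

The elements at each level are different. On the level that is currently expanded, how do I tell whether an element is a child or a parent?

If I do locate the parent element, how do I actually use it? The UIBOT material online does not cover this. I also have no other programming background; I started directly with UIBOT.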


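Question 3: for a category tree like https://www.amazon.com/gp/bestsellers, I want the whole tree structure together with the corresponding URLs. Is there a better way to scrape it?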


----------------------

arrayData = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""}, {"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})


arrayData2 = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"},     {"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})


arrayData3 = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"},{"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})
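The three calls above differ only in how many times the group selector (A) appears, so the list can be generated from the depth instead of being written out by hand. The following is a minimal sketch in Python, used purely for illustration since the thread itself is about UiBot; `build_selectors` and `build_columns` are hypothetical helper names, and it is an assumption that UiElement.DataScrap accepts a column description assembled at run time rather than only a literal.

```python
import json

# Class name of the repeated group element ("A" in the post), copied from the generated code.
GROUP_CLASS = "_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz"

# The two selectors that always close the list: the inner div and the link.
TAIL = [
    {"tag": "div", "value": "div", "index": 0, "prefix": ">"},
    {"tag": "a", "index": 0, "className": "", "value": "a", "prefix": ">"},
]


def group_selector(prefix):
    """Return one copy of the repeated group selector with the given prefix."""
    return {
        "tag": "div",
        "index": 0,
        "className": GROUP_CLASS,
        "value": "div." + GROUP_CLASS,
        "prefix": prefix,
    }


def build_selectors(depth):
    """Build the selector list for a tree level (hypothetical helper).

    depth=1 matches arrayData, depth=2 arrayData2, depth=3 arrayData3.
    The first group has an empty prefix; every further group uses ">".
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    groups = [group_selector("")]
    groups += [group_selector(">") for _ in range(depth - 1)]
    return groups + [dict(s) for s in TAIL]


def build_columns(depth):
    """Wrap the selectors in the column description used by the calls above."""
    return {
        "ExtractTable": 0,
        "Columns": [{"selecors": build_selectors(depth), "props": ["text", "url"]}],
    }


if __name__ == "__main__":
    # Print the level-2 description so it can be compared with arrayData2.
    print(json.dumps(build_columns(2), ensure_ascii=False))
```

The key name "selecors" is kept exactly as it appears in the generated code. In a UiBot flow the same idea would be a loop that appends the group selector to an array before the scrape call.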


Latest replies (9)
  • 果子哩 2022-7-27
    2
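    Summary:
    Step 1: use the Get Child Elements (获取子元素) command to fetch the first-level titles.
    Step 2: iterate over the child elements. If a child element already carries the title text, extract it; otherwise use the Get Element Text (获取元素文本) command to read it.
    Step 3: for each child element visited, get its own child elements and repeat step 2.
    If you use recursion, write the child-element traversal as a recursive function and call it.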
  • life 2022-7-27
    3
    Just write your own JS.
  • rainvale 2022-7-27
    4
    The forum expert 瞌睡虫子 shared a runjs plugin that can scrape page data.
  • 驿站工作室 2022-7-27
    5



    Try this plugin; a few lines of code will do it.

  • 冯润阳 2022-7-27
    6
    rainvale The forum expert 瞌睡虫子 shared a runjs plugin that can scrape page data.
    I couldn't find it in the Command Center.
  • 冯润阳 2022-7-27
    7
    驿站工作室 Try this plugin; a few lines of code will do it.
    Which plugin is this?
  • 驿站工作室 2022-7-27
    8
    冯润阳 Which plugin is this?
    Search the community for "相似元素" (similar elements).
  • 冯润阳 2022-7-27
    9
    rainvale The forum expert 瞌睡虫子 shared a runjs plugin that can scrape page data.
    I took a look. It's really hard, and I can't work out how to use it.
  • 冯润阳 2022-7-27
    10
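    驿站工作室 Search the community for "相似元素" (similar elements).
    Thanks, the similar-elements plugin solves the problem and is very convenient.
    The only catch is that things have to be copied over, so how do I locate UiElement.DataScrap on each recursive call? Am I using it wrong?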
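
On the recursion itself, one possible shape is to keep the scraping step behind a single function and let the recursion deal only with the tree. The sketch below is again in Python for illustration and makes several assumptions: `fetch_children` is a hypothetical callback that, in a real flow, would expand the category at the given path and run the scrape for that level; `FAKE_SITE` and its example.com URLs are made-up stand-in data so the code can run without a browser.

```python
def walk(path, fetch_children, max_depth=5):
    """Recursively collect the category tree below `path`.

    `fetch_children(path)` is a hypothetical callback returning a list of
    (text, url) pairs for the entries directly under `path`.
    `max_depth` guards against runaway recursion.
    """
    if len(path) >= max_depth:
        return []
    nodes = []
    for text, url in fetch_children(path):
        child_path = path + [text]
        nodes.append({
            "text": text,
            "url": url,
            # An entry with no lower level simply yields an empty list here.
            "children": walk(child_path, fetch_children, max_depth),
        })
    return nodes


def flatten(tree, prefix=()):
    """Turn the nested tree into (full path, url) rows, parents before children."""
    rows = []
    for node in tree:
        full = prefix + (node["text"],)
        rows.append((" > ".join(full), node["url"]))
        rows.extend(flatten(node["children"], full))
    return rows


# Made-up stand-in for the real page: maps a path to the entries beneath it.
FAKE_SITE = {
    (): [("Books", "https://example.com/books"), ("Toys", "https://example.com/toys")],
    ("Books",): [("Fiction", "https://example.com/books/fiction")],
}


def fake_fetch(path):
    """Stand-in for the real scrape; looks the path up in FAKE_SITE."""
    return FAKE_SITE.get(tuple(path), [])


if __name__ == "__main__":
    for name, url in flatten(walk([], fake_fetch)):
        print(name, url)
```

With this split, the depth of the current call is just `len(path) + 1`, which is the number that decides how many times the group selector has to be repeated for that level.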