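I need to scrape a category directory that expands over multiple levels; see: https://www.amazon.com/gp/bestsellers
Clicking any root category on the left reveals the next level of the category tree beneath it.
I would like to implement this recursively.
While using the UiElement.DataScrap command, I noticed the following pattern:
In the "selecors" part, one entry is repeated once more for every level you go down. Put simply:
arrayData=........"selecors":[A, {"tag".........
arrayData2=........"selecors":[A,A, {"tag".........
arrayData3=........"selecors":[A,A,A, {"tag".........
Note the places marked A. Here A stands for a string, specifically {"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"} (in the full commands at the end of this post, the first A in each list has "prefix":"" instead of ">").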
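The pattern above can be written as a small function of the depth. What follows is only a sketch in Python, used as neutral notation rather than UiBot syntax; the names GROUP_CLASS, group_entry and build_selectors are made up for illustration. It follows the full commands at the end of this post, where the first copy of A has an empty "prefix" and every further copy has ">".

```python
import json

# Class name copied from the selector in the post.
GROUP_CLASS = "_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz"


def group_entry(prefix):
    """Build one copy of the entry that the post calls A."""
    return {
        "tag": "div",
        "index": 0,
        "className": GROUP_CLASS,
        "value": "div." + GROUP_CLASS,
        "prefix": prefix,
    }


def build_selectors(depth):
    """Return the "selecors" list for one tree level (depth 1 = root list)."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # The first copy of A has an empty prefix; each deeper copy uses ">".
    groups = [group_entry("" if i == 0 else ">") for i in range(depth)]
    # Fixed tail shared by all three commands in the post: div > a.
    tail = [
        {"tag": "div", "value": "div", "index": 0, "prefix": ">"},
        {"tag": "a", "index": 0, "className": "", "value": "a", "prefix": ">"},
    ]
    return groups + tail


if __name__ == "__main__":
    for depth in (1, 2, 3):
        print(depth, json.dumps(build_selectors(depth)))
```

With depth 1, 2 and 3 this produces lists of 3, 4 and 5 entries, matching the shape of arrayData, arrayData2 and arrayData3.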
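So, question 1:
With the selector list growing by one entry per level like this, how should I write the recursion?
The data-scraping part of my source code is included at the end of this post. Each time a level of the directory is scraped, I loop over its entries, and for every entry that has a lower level, I loop over that level in turn.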
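One way to picture the recursion described here is a function that scrapes one level and then calls itself for every entry found on it. The sketch below is Python used as neutral notation, not UiBot code. The callback scrape_level is hypothetical: it stands in for opening the entry at a given path and running UiElement.DataScrap for that level, and it is replaced here by a made-up in-memory tree (FAKE_TREE, with placeholder names and URLs) so the example runs on its own.

```python
def walk(scrape_level, path=(), max_depth=10):
    """Recursively collect (path, url) for every entry below `path`.

    scrape_level(path) must return a list of (text, url) pairs for the
    children of `path`, or an empty list when there is no lower level.
    The selector depth needed for that scrape would be len(path) + 1.
    """
    if len(path) >= max_depth:
        # Safety stop so a wrong selector cannot recurse forever.
        return []
    rows = []
    for text, url in scrape_level(path):
        child = path + (text,)
        rows.append((child, url))
        # Descend into the entry just recorded.
        rows.extend(walk(scrape_level, child, max_depth))
    return rows


# Made-up data standing in for the real page (hypothetical names and URLs).
FAKE_TREE = {
    (): [("Books", "/books"), ("Toys", "/toys")],
    ("Books",): [("Fiction", "/books/fiction")],
}


def fake_scrape_level(path):
    """Stand-in for the real scrape: look the children up in FAKE_TREE."""
    return FAKE_TREE.get(path, [])


if __name__ == "__main__":
    for node_path, url in walk(fake_scrape_level):
        print(" > ".join(node_path), url)
```

Each returned row carries the full path from the root together with its URL, so the tree structure can be rebuilt from the rows afterwards.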
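Question 2: someone suggested that I use child elements and parent elements, but I do not really understand this.
The elements at each level are different, so how do I tell, on the level that has just been opened, whether an element is a child or a parent?
And once I have found the target parent element, how do I actually use it? The UIBOT material online does not cover this point, and I have no other programming background; I started directly with UIBOT.
Question 3: for a category tree like https://www.amazon.com/gp/bestsellers, I want the whole tree structure together with the URL of each node. Is there a better way to scrape it?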
----------------------
arrayData = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""}, {"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})
arrayData2 = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"}, {"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})
arrayData3 = UiElement.DataScrap({"wnd":[{"cls":"Chrome_WidgetWin_1","title":"*","app":"chrome"},{"cls":"Chrome_RenderWidgetHostHWND","title":"Chrome Legacy Window"}],"html":[{"tag":"DIV","id":"*"}]},{"ExtractTable":0,"Columns":[{"selecors":[{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":""},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"},{"tag":"div","index":0,"className":"_p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","value":"div._p13n-zg-nav-tree-all_style_zg-browse-group__88fbz","prefix":">"},{"tag":"div","value":"div","index":0,"prefix":">"},{"tag":"a","index":0,"className":"","value":"a","prefix":">"}],"props":["text","url"]}]},{"objNextLinkElement":"","iMaxNumberOfPage":5,"iMaxNumberOfResult":-1,"iDelayBetweenMS":1000,"bContinueOnError":False})