Added the ability for pure xml to be returned from either a .all() or a .search(). #11

djm4686 · 2017-05-08T19:44:06Z

This allows for potentially much faster searching through the data, as you don't have to wait for each element in the tree to build into an object.

djm4686 · 2017-05-08T19:49:42Z

This is my first actual open source PR (woweee!) let me know if I fudged something.

matejv · 2017-05-10T07:56:41Z

Thanks for your input. I do have a couple of questions:

What is your use-case for this? A major motivator in creating pynetdot for me was to not have to parse XML for each request. Without it pynetdot is just a thin HTTP client.
Time spent parsing XML should be minuscule compared to waiting for a response from netdot API. Have you noticed different behaviour?

I'm not against merging this, but I would like to better understand the motivation behind it.

djm4686 · 2017-05-10T17:33:09Z

The company I work for has over 400 routers in netdot inventory. I am making a script to pull the router hostnames from netdot so I can connect to them remotely and run commands.

This library parses the xml and creates an object for each element in the xml DOM. With the amount of routers I have to pull (which will continually go up) this takes some time to create the object for each element. I timed it to be roughly .1-.2 seconds each, not too much but with 400+ routers that gets slow very fast, especially considering we want this project I'm working on to run at least once every 10 mins.

There was only a small bit of information I needed to grab from the netdot API (hostname).

With my changes, I was able to parse and get the information I needed with:

root = xml.etree.ElementTree.fromstring(pynetdot.Device._search(pure_xml=True))
routers = []
for el in root:
    name = el.attrib["name"]
    if re.match(regex, name):
        routers.append(name)

This takes roughly 5-6 seconds with the amount of routers we have compared to the ~2 mins it took with the xml parsing built in.

matejv · 2017-05-11T06:37:06Z

Ooohhh I see what your problem is now!

It's not actually the XML parsing that is taking the time. In fact a device name attribute is not a plain string in netdot but a reference to antother type of resource - a DNS record. So when you try to do device.name pynetdot makes another HTTP request in the background to the netdot API to retrieve that resource. So you get all devices in the first HTTP request but then make another one for each of your 400 devices to get the hostnames. This obviously takes time.

Now, fortunately you can access the hostname without hitting the API again: device._attrs is a dict with all XML attributes already parsed. You can try this code:

devices = pynetdot.Device.all()
names = []
for device in devices:
    name.append(device._attrs['name'])

I'm sure you'll find this method is much faster.

djm4686 · 2017-05-11T13:10:31Z

+1, that worked. Though I have to ask... if you already have the info in the dict why bind the property to an API call?

matejv · 2017-05-11T13:59:53Z

Well the name attribute of a device is another type of resource in netdot (DNS resource record). So when you do device.name you're actually accessing an object and that object needs to be populated from the API. But since you're using it in string context it gets converted to a string automatically.

Now when you call the API to get devices, the response XML already has the string representation of that DNS resource record object - and that gets stored in device._attrs['name']. But I only have that info from that object. DNS resource record objects have other attributes that you could use, so I have to load them when you first access that object.

I know this is not an optimal solution and I need to find a better way to handle it. Especially device names are such a common use-case that pynetdot really should not be accessing the API for every device. I do have some ideas how to solve this, but I need some time to test.

Hope my explanations makes some sense :)

Daniel Madden added 3 commits May 8, 2017 14:48

Removed xml parsing form searching for increase speed

37fcfa9

Added kwargs flag for getting pure xml from .search()

b5f1e6c

Added teh ability for pure xml to be retrieved from an all() or search()

7a0772b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added the ability for pure xml to be returned from either a .all() or a .search(). #11

Added the ability for pure xml to be returned from either a .all() or a .search(). #11

djm4686 commented May 8, 2017

djm4686 commented May 8, 2017

matejv commented May 10, 2017

djm4686 commented May 10, 2017 •

edited

Loading

matejv commented May 11, 2017

djm4686 commented May 11, 2017

matejv commented May 11, 2017

Added the ability for pure xml to be returned from either a .all() or a .search(). #11

Are you sure you want to change the base?

Added the ability for pure xml to be returned from either a .all() or a .search(). #11

Conversation

djm4686 commented May 8, 2017

djm4686 commented May 8, 2017

matejv commented May 10, 2017

djm4686 commented May 10, 2017 • edited Loading

matejv commented May 11, 2017

djm4686 commented May 11, 2017

matejv commented May 11, 2017

djm4686 commented May 10, 2017 •

edited

Loading