Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the ability for pure xml to be returned from either a .all() or a .search(). #11

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

djm4686
Copy link

@djm4686 djm4686 commented May 8, 2017

This allows for potentially much faster searching through the data, as you don't have to wait for each element in the tree to build into an object.

@djm4686
Copy link
Author

djm4686 commented May 8, 2017

This is my first actual open source PR (woweee!) let me know if I fudged something.

@matejv
Copy link
Member

matejv commented May 10, 2017

Thanks for your input. I do have a couple of questions:

  • What is your use-case for this? A major motivator in creating pynetdot for me was to not have to parse XML for each request. Without it pynetdot is just a thin HTTP client.
  • Time spent parsing XML should be minuscule compared to waiting for a response from netdot API. Have you noticed different behaviour?

I'm not against merging this, but I would like to better understand the motivation behind it.

@djm4686
Copy link
Author

djm4686 commented May 10, 2017

The company I work for has over 400 routers in netdot inventory. I am making a script to pull the router hostnames from netdot so I can connect to them remotely and run commands.

This library parses the xml and creates an object for each element in the xml DOM. With the amount of routers I have to pull (which will continually go up) this takes some time to create the object for each element. I timed it to be roughly .1-.2 seconds each, not too much but with 400+ routers that gets slow very fast, especially considering we want this project I'm working on to run at least once every 10 mins.

There was only a small bit of information I needed to grab from the netdot API (hostname).

With my changes, I was able to parse and get the information I needed with:

root = xml.etree.ElementTree.fromstring(pynetdot.Device._search(pure_xml=True))
routers = []
for el in root:
    name = el.attrib["name"]
    if re.match(regex, name):
        routers.append(name)

This takes roughly 5-6 seconds with the amount of routers we have compared to the ~2 mins it took with the xml parsing built in.

@matejv
Copy link
Member

matejv commented May 11, 2017

Ooohhh I see what your problem is now!

It's not actually the XML parsing that is taking the time. In fact a device name attribute is not a plain string in netdot but a reference to antother type of resource - a DNS record. So when you try to do device.name pynetdot makes another HTTP request in the background to the netdot API to retrieve that resource. So you get all devices in the first HTTP request but then make another one for each of your 400 devices to get the hostnames. This obviously takes time.

Now, fortunately you can access the hostname without hitting the API again: device._attrs is a dict with all XML attributes already parsed. You can try this code:

devices = pynetdot.Device.all()
names = []
for device in devices:
    name.append(device._attrs['name'])

I'm sure you'll find this method is much faster.

@djm4686
Copy link
Author

djm4686 commented May 11, 2017

+1, that worked. Though I have to ask... if you already have the info in the dict why bind the property to an API call?

@matejv
Copy link
Member

matejv commented May 11, 2017

Well the name attribute of a device is another type of resource in netdot (DNS resource record). So when you do device.name you're actually accessing an object and that object needs to be populated from the API. But since you're using it in string context it gets converted to a string automatically.

Now when you call the API to get devices, the response XML already has the string representation of that DNS resource record object - and that gets stored in device._attrs['name']. But I only have that info from that object. DNS resource record objects have other attributes that you could use, so I have to load them when you first access that object.

I know this is not an optimal solution and I need to find a better way to handle it. Especially device names are such a common use-case that pynetdot really should not be accessing the API for every device. I do have some ideas how to solve this, but I need some time to test.

Hope my explanations makes some sense :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants