JSON Parsing and DirecTV DVR
the8thst
Posts: 470
Does anyone have a working JSON Parser built in Netlinx?
Creating a module for 2-way control of DirecTV HR boxes is next on my todo list and all of their replies are in the standard JSON format.
It would be really nice to not have to reinvent the wheel for a web standard.
Thanks.
Comments
I have a working parser. However, it seems to be painfully slow (10 seconds to parse a 17,000+ character response) on an NI-2000. Many of the parsers I've seen, including the Duet parser in AMX Tools, do it character by character, which is where I started; however, I've since stopped evaluating every character when parsing keys and their values. The Duet version *seems* to parse the returned string from the DTV much more efficiently and *MUCH* more quickly: "parsing" (as reported with DEBUG-4 turned on) only takes about a second, which is quite a difference in speed I'd say.
Currently, I'm working on a generic parser which sends a string back for every key / value. For example, this is the output of querying program info:
The code I have does not store any information whatsoever; it merely parses and then passes back the output.
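As a rough illustration of the "one string per key/value" approach, here is a minimal Python sketch (it is not the NetLinx code from this post). The `path=value` output format, the `emit_pairs` name, and the sample field names are all made up for the example; the parser's real output format is not shown in the thread. The sketch also leans on Python's `json` module to do the tokenizing, which a NetLinx parser has to do by hand.

```python
import json

def emit_pairs(value, path=""):
    """Yield one 'path=value' string per leaf value (hypothetical format)."""
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}" if path else key
            yield from emit_pairs(child, child_path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from emit_pairs(child, f"{path}[{index}]")
    else:
        # Leaf: re-encode so strings stay quoted and booleans stay lowercase.
        yield f"{path}={json.dumps(value)}"

# Made-up response; these are not real DirecTV field names.
sample = json.loads('{"show": {"name": "News", "live": true}, "ids": [7, 9]}')
for line in emit_pairs(sample):
    print(line)
```

Because `emit_pairs` is a generator, nothing is accumulated on the emitting side; each string is handed to the caller as soon as it is produced, which is roughly the "parse and pass back" behaviour described above.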
I have not had any time to work on the JSON parsing in Netlinx since starting this thread. I really wish we had regular expressions in Netlinx as it would make the task much easier, but we don't.
My idea was to use find_string to locate the beginning and ending tokens of each data pair to create pointers. Then a little bit of simple math on the pointer values will let you find and pull out nested values and embedded arrays.
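A minimal sketch of that pointer idea, written in Python with `str.find` standing in for find_string and slicing standing in for mid_string. The function name `find_value` is hypothetical, and the sketch makes simplifying assumptions that a real parser could not: it ignores escaped quotes, it does not notice brackets that appear inside string values, and it assumes the key text does not also occur earlier as a string value.

```python
def find_value(text, key, start=0):
    """Return the raw text of the value for `key`, or None if not found."""
    token = '"' + key + '"'
    key_pos = text.find(token, start)            # pointer to the key
    if key_pos < 0:
        return None
    colon = text.find(":", key_pos + len(token))
    if colon < 0:
        return None
    begin = colon + 1
    while begin < len(text) and text[begin] in " \t\r\n":
        begin += 1                               # skip white-space before the value
    if begin >= len(text):
        return None
    opener = text[begin]
    if opener == '"':                            # string value
        end = text.find('"', begin + 1)
        return text[begin + 1:end] if end >= 0 else None
    if opener in "{[":                           # nested object or array
        closer = "}" if opener == "{" else "]"
        depth = 0
        for pos in range(begin, len(text)):
            if text[pos] == opener:
                depth += 1
            elif text[pos] == closer:
                depth -= 1
                if depth == 0:
                    return text[begin:pos + 1]
        return None
    end = begin                                  # number, true, false or null
    while end < len(text) and text[end] not in ",}] \t\r\n":
        end += 1
    return text[begin:end]

sample = ('{"firstName": "John", "age": 25, '
          '"address": {"city": "New York", "geo": {"lat": 1}}, '
          '"phoneNumber": [{"type": "home"}]}')
print(find_value(sample, "address"))
```

A nested value comes back as a raw substring, so the same function can be called again on that substring to pull out inner keys.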
I think writing this module in Duet is the ideal way to do it, but I don't have a Duet license, or the money and time to purchase a license and teach myself Duet.
I think you are working on Duet a little bit, so my recommendation is to have your Netlinx module save the JSON responses to a file, and then pass the file name and location to a Duet based JSON parser which will send the sanitized data pairs and arrays back to Netlinx in standard string events.
There are a lot of different examples online for how to easily parse JSON, RSS, XML, etc. in Java.
It would be extremely nice to have a couple Duet Parser libraries for the standard data structures used on the web so we can quickly send a file to Duet for parsing and get the results back in Netlinx.
I know of someone on another site who wrote their own JSON parser; I should ask them what their times are like when processing large files. Parsing small responses is fine - it's the large responses that take too long.
It turns out that the first rev (character-by-character) is a full second and a half faster than the search & destroy version. Unfortunately I can't quite think of another approach that might give a faster Netlinx implementation. Mind you, these were run on an NI-2000 running 3.60.453 FW. I ran the char-by-char one on an NI-900 and it was 2 seconds faster than my NI-2000. If I can't figure out a faster way, I'll probably wind up posting both versions here and allow for collaboration on them. For now though, I'm taking a break on this and working on real work.
Each test was run 5 times. Here are the results.
Test 1 - 454 Character JSON Response (Program Info Request)
Parser 1 (Character-by-Character): Average 0.171 seconds
Parser 2 (Search & Destroy): Average 0.149 seconds
Test 2 - 17,329 Character JSON Response (DVR Listing Request)
Parser 1 (Character-by-Character): Average 9.363 seconds
Parser 2 (Search & Destroy): Average 10.883 seconds
Okay - here we go. After several weeks of working on this off and on, I've trimmed it down and made it more of a generic JSON parser. Here are the results:
You might be questioning how the Yahoo stocks example was parsed faster than the DirecTV playlist file: it's because the Yahoo return has no white-space to skip (except for the CR/LFs from the chunking of the data when coming from the server). This makes sense, as running through each character is slower than searching for specific characters and performing mid_string. My tests were performed using files rather than actually receiving the data from the servers, since I wanted to eliminate that variable. So the attached has several ".json" files in it - put those in the main directory of the master to run the tests as I have.
I would love to see results from an NI-x100 and NI-x00 as well; if anyone improves on the code - please don't be selfish; share it. Also, it's Rev. 8 that is the quickest; Rev. 7 is what it was built off of - that is probably the best revision to start with if anyone decides to improve upon it.
I have both an NI-3000 and NI-3100 on my desk. I will try to run the tests tomorrow.
NI-3000:
Again, this is only meant for a starting point in quick & effective JSON parsing; you could potentially use it as is, but I intend to add some features to it. With the source being all right here - anyone is free to make changes. (BTW - sorry for the lack of commenting! Sometimes my mind gets going so quick I don't have time to jot down what I'm doing.)
I misread "NI-X100 and NI-X00" as "NI-X100 and NI-X000".
I'm going to start writing a DTV module that'll parse the results into structures that would be easier to manage.
I was expecting these results to be slower than they are.
Anyway - attached is some code for DirecTV JSON parsing. I changed the format of the strings returned from the parser, and they are now a bit unconventional as far as AMX standards go. I have the parsing working in the example DTV module for the playlist & program info responses. It should be easy enough to add more responses if needed. Also, because of the way DTV responds I had to move the status object to the beginning; that way the DTV module knows what kind of response it got while parsing the rest of the info. I did this in the Main source file. This example is by no means complete, and the attached code assumes you (anyone using this) know how to get complete results to send to the parser.
I think I'm about done with this and probably won't be taking it too much further until DirecTV solidifies what they're doing, so if anyone adds GUI to it, or makes it more complete - I'd be interested as I'm sure everyone else would be.
As always, feedback would be nice.
I did not want to start a new thread, since this seems like a GREAT one for parsing JSON. Is anyone familiar with how to send a string in JSON format? Is it all ASCII characters?
Taking an example from http://en.wikipedia.org
If I wanted to send the following to an IP device, is it all ASCII string(s) plus 13,10s?
{
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": 10021
  },
  "phoneNumber": [
    {
      "type": "home",
      "number": "212 555-1234"
    },
    {
      "type": "fax",
      "number": "646 555-4567"
    }
  ]
}
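On the format question itself: JSON is plain text, and white-space between tokens (including CR/LF) is insignificant, so the line breaks in the example above are only there for readability and the whole object can be sent as one line. Whether the device also expects HTTP framing around it depends entirely on its protocol. Purely as a sketch, assuming the device accepts an HTTP POST with a JSON body, the bytes could be built like this in Python. The host, the path, and the `build_http_post` name are placeholders, not anything from a real device's documentation.

```python
import json

def build_http_post(host, path, payload):
    """Build the raw bytes of an HTTP/1.1 POST carrying `payload` as JSON."""
    # Compact separators drop the optional white-space; json.dumps escapes
    # non-ASCII characters by default, so the body is pure ASCII.
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Connection: close\r\n"
        "\r\n"                      # blank line ends the headers
    ).encode("ascii")
    return head + body

# Placeholder address and path, for illustration only.
request = build_http_post("192.168.1.50", "/api", {"firstName": "John", "age": 25})
print(request.decode("ascii"))
```

The only CR/LF pairs (13,10) here belong to the HTTP header lines; none are needed inside the JSON itself.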
What type of application are you looking to interact with? A protocol document would be helpful in this situation to better answer your questions. Using HTTP as the protocol is open ended since there are many different things that could (or could not) be required.
If you use Firefox, Live Headers is a great add-on to "sniff" the headers that are exchanged.
https://addons.mozilla.org/en-us/firefox/addon/live-http-headers/
Wireshark is another great tool to have, not just for HTTP but in general to make sure that your responses are correct.
http://www.wireshark.org/
I know it's been a while, but just wanted to share the results of running the tests on a new NX-4200.
Plus side is that it's open source, so feel free to hack away at it.
JJames,
This seems to be working pretty well in my little program, but I noticed something odd and I've gotten a system message I've never seen before.
It looks like you left in a couple of send string 0 calls for timing purposes, with "START..." and "..END", in the module. These BOTH show up in my diagnostic window before I start seeing the diagnostic messages I placed in my data event to receive the messages from the JSON module. It seems like the module chews through the whole JSON message and "ends" before the data event in the main program is sent a single string. How do messages queue up between modules? How multithreaded are the new processors? (This is running on an NX-2200.)
On a related note, as my JSON strings are getting longer I've started getting " (Reader= Writer=)- CMessagePipe::MAX = 25 messages" in my diagnostic window. Any idea what it is? I'm getting more than 25 data events and nothing seems to be dropped, but I'm still testing.
That being said you should only send the PARSE command to the module when you are certain you have all of the data to be parsed. The json variable is shared between the main program and the module. So when you're done populating it, just call PARSE and the module should handle the rest.
Do you know what the " (Reader= Writer=)- CMessagePipe::MAX = 25 messages" message is about? Is there a setting I should change?
You can use "show buffers" to see where they're at. Everything should read zero for a normal system.
If you mean the rate at which they are displayed to you, there is:
- Preferences
- Diagnostics
- Diagnostics and Notifications Output Displays
Read X line(s) from the buffer every 1/4 second.
Granted, this only helps if there are more lines coming in than it can display. (You can't change the speed.)
Can you post the code? That way we can look for things to optimize.
In my XML module I have a CONTINUE command that the receiving master code has to send after it has saved each element into the structure, so that way the reader doesn't just fill up the interpreter queue trying to read the whole file in one shot. You'll probably need something like this to throttle the JSON reader. It's less fun to write an asynchronous file/stream reader, but it keeps you from losing data.
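The CONTINUE handshake described above can be sketched roughly as follows. This is a Python illustration of the flow-control idea only, not the XML module's code; `ThrottledReader`, `start`, and `on_continue` are hypothetical names, and the `send` callback stands in for whatever string or command the module would send to the master code.

```python
class ThrottledReader:
    """Emit one parsed element at a time, waiting for CONTINUE in between."""

    def __init__(self, elements, send):
        self._pending = iter(elements)   # elements still to be delivered
        self._send = send                # callback that delivers one element
        self.done = False

    def start(self):
        """Deliver the first element only."""
        self._emit_next()

    def on_continue(self):
        """Called when the receiver has stored the previous element."""
        self._emit_next()

    def _emit_next(self):
        try:
            element = next(self._pending)
        except StopIteration:
            self.done = True             # nothing left to send
            return
        self._send(element)

received = []
reader = ThrottledReader(["title=News", "channel=4", "length=30"], received.append)
reader.start()                           # only the first element goes out
while not reader.done:
    # The receiver saves the element, then acknowledges with CONTINUE.
    reader.on_continue()
print(received)
```

At most one element is ever in flight, so the receiver's queue cannot fill up no matter how large the file is; the cost is one round trip per element.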