Tuesday, October 31, 2017

The Case of the Four Missing Characters

Alternatively: Initialize Your Variables

So it's been almost a year since I've used this. Honestly, I thought that after ENTR 390 was over, I wouldn't have much else to say. And I didn't - until I ran into something else interesting. Oh well. I guess the 0 readers of the internet will get to hear about my splurges.
(BTW if I should take this down please do tell)

Setting the Scene

The class in question is EECS 370, Introduction to Computer Organization, probably the most hardware-oriented class in the Computer Science major. It's a different way of thinking, but adaptation is survival.

The project in question concerns linking and loading: compiling assembly files into object files, and linking multiple object files together to form a machine code "executable." This executable is then backwards-compatible with the simulation program written for the previous project.

Linking Woes

Linking by itself seems straightforward: copy over the text and data sections, then use the relocation table to figure out which fields need to be changed, and use the symbol table to resolve global ones, and some math to resolve local ones.
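The "some math" for locals can be sketched roughly like this. It is a minimal illustration, not the project's actual code: it assumes each object file numbers its own addresses from 0 with data directly after text, and that the executable lays out every text section first and every data section after them. FileLayout and relocateLocal are hypothetical names.

```cpp
#include <cassert>

// Hypothetical record of where one object file ends up in the executable.
struct FileLayout {
    int textSize;   // words in this file's text section
    int dataSize;   // words in this file's data section
    int textStart;  // address where this file's text lands in the executable
    int dataStart;  // address where this file's data lands in the executable
};

// Translate an address that was valid inside one object file (text at 0,
// data right after the text) into its address in the linked executable.
int relocateLocal(const FileLayout& f, int localAddr) {
    assert(localAddr >= 0 && localAddr < f.textSize + f.dataSize);
    if (localAddr < f.textSize) {
        return f.textStart + localAddr;             // label lives in text
    }
    return f.dataStart + (localAddr - f.textSize);  // label lives in data
}
```

With a file whose 3 words of text land at address 5 and whose data lands at address 12, local address 1 becomes 6 and local address 3 (the first data word) becomes 12.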

And indeed, that's what my code does. After a few minor headaches here and there with the math involved with locals, I've got my code down to a tee. Just a single test case, which is what they give us in the spec, but that's okay; the linker gives the same answer. A simple submission should do the trick...

Hint: the following student test cases exposed the student linker as buggy: countdown
Wait a minute, that can't be right. I manually linked it and assembled that with the INSTRUCTOR solution! And the machine code matches! What gives?

Sleepless Nights

Ok, it wasn't that dramatic. But this issue did bother me a bunch. What's going on here? Here's a hint from now:
But it works on my machine!
But obviously past me doesn't know that. Instead, I venture towards another solution:

More test cases! Obviously more tests should catch more bugs, right? Indeed, I did end up catching more buggy solutions with my new tests:

  • Multiply. Specifically, the smart multiplication algorithm from the previous project, which chooses the proper multiplier and multiplicand to minimize instructions executed. Here, I broke the choosing part of the algorithm into its own subroutine.
  • Times4: Another test case found in the spec, which simply uses a subroutine to multiply a number by 4.
  • Combination: The third part of the project, which involved writing a function that computes n choose r. The function was also seeded with a driver file, which specifies it compute 4 choose 2.
I also caught an additional bug with the linker, where I forgot to consider the "Stack" label when resolving globals. It's gotta be fine now, right?
Hint: the following student test cases exposed the student linker as buggy: combination countdown multiply times4
Okay. Something's fishy here. I personally linked all these files manually and assembled them with the previous project's INSTRUCTOR solution, and the resulting machine code files were identical (apart from the final 0 I used as a sentinel for the label "Stack," but in both cases it points to the same place)! What the heck is happening?

A diff of the manually linked Combination and the Linker Combination, showing the extra 0 used as the "Stack" label being the only difference


It Works on My Machine!

Sadly, not every platform is the same. I finally had had enough of banging my head against the Windows wall. Maybe it was different on Linux?

So I decided to compile my code on Linux and run it, and voila! There's an anomaly!
The Windows and Linux Results are Different! But what could cause this?
After a bit of tweaking the debug output, I saw something quite odd:
Wait a minute... Why is AllDataStart at THAT Value?
A bit of digging through my code, and the answer is obvious: I had never set allDataStart = 0; One simple mistake had cost so much time and effort. All for four missing characters. Not even, in fact. Because two would suffice.

Fixed?

To keep things short and sweet, yes. Initializing one variable that I had neglected to do before had fixed the issue. And my recurring headaches.

As for why? I'm not too sure. Not initializing the values has undefined behaviour, of course, but on Windows it seems to always have been initialized properly, to 0. In Linux, the environment the grader runs in, that doesn't seem to be the case. A cryptic value of 32768 comes up: not entirely sure why. It could have started as 32764, then gained a value of 4 as it is supposed to, but 32764 is such an odd value. Maybe, in its uninitialized woes, Linux decides to have mercy and initialize it to a nice value?
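For illustration, here is a minimal sketch of the kind of accumulator that goes wrong without a starting value. It assumes allDataStart is the total size of all text sections, which is a guess based on the name; computeAllDataStart and textSizes are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Sum the text section sizes to find where the combined data region begins.
// Assumption: data starts right after all text in the executable.
int computeAllDataStart(const std::vector<int>& textSizes) {
    int allDataStart = 0;  // the explicit "= 0" is the whole fix
    for (int size : textSizes) {
        assert(size >= 0);
        allDataStart += size;  // without the initializer, this adds to garbage
    }
    return allDataStart;
}
```

A single file with 4 words of text gives 4 here, instead of 4 plus whatever happened to be in memory. Compiling with warnings enabled (for example -Wall on GCC or Clang) can sometimes flag a local variable that is read before it is set, though it will not catch every case.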

I don't know. Go ask someone who knows this better than me. I'm just happy this is finished.

Friday, December 9, 2016

We Done?

We've reached an unfortunate point now. Lab is over, and soon the entire semester will be as well. It's been fun.

Well, here's what we ended up creating in these 2-or-so-odd months.

Unfortunately Blogger is not very good with videos.

Monday, December 5, 2016

The Solutions to your Issues Lie in the Most Obscure Places

Memory Management Revisited

So remember the issues that we were having with the Photon and its memory management? Turns out C++ is pretty hard. One simple mistyped character caused a cascade of instability.

One Character?

Yep. One character.

The model we used was essentially an operating system for the Photon, with the input device being a 5 way button thingy and the output device being the screen. The system itself was composed of "input responders," a rough classification that is used for anything that responds to user input accordingly.

This model made use of both Menus and Info screens. The menus consisted of both the location menu, which gives the location of the device using Google's Geolocation API, and several first aid menus, which act as collections of responses to first aid situations. Then come the info screens, which give users the information they need to respond to various situations.
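The "input responder" idea can be sketched as a small class hierarchy. This is only an illustration of the concept under assumed names (Button, InputResponder, Menu), not the code that ran on the Photon; it renders to a string instead of an LCD so it can run anywhere.

```cpp
#include <cassert>
#include <string>

// The five directions of the 5-way button (names are assumptions).
enum class Button { Up, Down, Left, Right, Center };

// Anything that reacts to the button and can draw itself.
class InputResponder {
public:
    virtual ~InputResponder() = default;
    virtual void onButton(Button b) = 0;
    virtual std::string render() const = 0;
};

// A menu keeps its own item count and never moves the selection past it.
class Menu : public InputResponder {
public:
    Menu(const char* const* items, int itemCount)
        : items_(items), itemCount_(itemCount) {
        assert(itemCount > 0);
    }
    void onButton(Button b) override {
        if (b == Button::Down && selected_ + 1 < itemCount_) ++selected_;
        if (b == Button::Up && selected_ > 0) --selected_;
    }
    // Returns the currently selected item as the text to display.
    std::string render() const override { return items_[selected_]; }

private:
    const char* const* items_;
    int itemCount_;
    int selected_ = 0;
};
```

An info screen would be another subclass that ignores most buttons and returns its text from render().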

The issue we were experiencing arose with the first aid menus. Though we had passed a parameter for the number of different situations on each menu, we happened to forget that while drawing the screen. Instead, we used a constant number 5, which happened to be more than the number of situations. And because this is C++, the Photon attempts to access data that it shouldn't. Sometimes, some other data was there, which resulted in the random gibberish that sometimes popped up everywhere.
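A minimal sketch of the fix, assuming the menu receives its items as an array plus a count: the loop is bounded by the count that was passed in, and the constant 5 only acts as a cap on how many rows fit on screen. visibleLines, items, itemCount and maxRows are hypothetical names, and the function collects strings rather than drawing to the LCD.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collect the lines a menu screen would draw.
std::vector<std::string> visibleLines(const char* const* items,
                                      std::size_t itemCount,
                                      std::size_t maxRows) {
    std::vector<std::string> lines;
    // Never read past the real number of items, even if the screen has room.
    const std::size_t rows = itemCount < maxRows ? itemCount : maxRows;
    for (std::size_t i = 0; i < rows; ++i) {
        assert(items[i] != nullptr);
        lines.push_back(items[i]);
    }
    return lines;
}
```

With 3 situations and room for 5 rows, only 3 entries are read; the buggy version would have gone on to read items[3] and items[4], which do not exist.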

A simple fix made the device stable. Or, well, almost. There was another issue that impeded progress.

Be the Better Programmer

This issue arose from my attempt to be a good programmer. Whenever I compiled the code, the compiler always warned me about the deprecated conversion from Particle's String to a char array. So I decided to create a method that would solve this issue and convert it without the trouble.

As luck would have it, this method caused more trouble than it solved. In the conversion process, random characters were added on for some reason. Perhaps it was in the length method, or perhaps it was something else. But one way or another, these random characters were tacked on.

It was perhaps these random characters that caused issues similar to the ones we had solved beforehand. The library we used for the Nokia 5110, our LCD, made use of a bitmap for each character in order to store how the characters should be rendered. But since these random characters were not in the range of that array, the Photon once again began accessing garbage data. Best case scenario, it renders as blank. Common scenario, it renders as random data.
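A guard like the one below would keep a stray byte from indexing outside the font table. The font layout here is an assumption (one glyph per printable ASCII character from 0x20 to 0x7E); the actual library's table may differ, and glyphIndex is a hypothetical helper.

```cpp
#include <cassert>

// Assumed font table range: ' ' (0x20) through '~' (0x7E).
constexpr int kFirstChar = 0x20;
constexpr int kLastChar = 0x7E;
constexpr int kGlyphCount = kLastChar - kFirstChar + 1;

// Map any byte to a valid index into the glyph table.
int glyphIndex(unsigned char c) {
    if (c < kFirstChar || c > kLastChar) {
        return 0;  // out of range: fall back to the space glyph
    }
    const int index = c - kFirstChar;
    assert(index >= 0 && index < kGlyphCount);
    return index;
}
```

Anything outside the table renders as a blank instead of whatever bytes happen to sit after the font data.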

Turns out that, like most deprecated things, support still exists. So instead of going through the conversion ourselves, make the Photon do it for us. It works just fine that way.
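Nothing above pins down where the stray characters came from. One common way to get trailing garbage in a hand-written conversion is forgetting the terminating '\0', so here is a sketch of a copy that always terminates. It uses std::string as a stand-in for Particle's String, and copyToBuffer is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copy `text` into `out` (capacity `outSize` bytes) and always terminate it.
// Returns the number of characters copied, not counting the terminator.
std::size_t copyToBuffer(const std::string& text, char* out, std::size_t outSize) {
    assert(out != nullptr && outSize > 0);
    std::size_t n = text.size();
    if (n > outSize - 1) {
        n = outSize - 1;  // leave room for the terminator
    }
    std::memcpy(out, text.data(), n);
    out[n] = '\0';  // without this, readers run on into whatever follows
    return n;
}
```

Text longer than the buffer gets truncated rather than overflowing, which is a design choice that suits a fixed-width screen.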

So why the Crashing?

To be honest, I can't tell. My best guess is that either the bitmap or the array was allocated near the end of memory, so accessing out-of-bounds data resulted in attempting to access memory that doesn't exist. And the Photon responds by crashing.

Can't blame it.

Of course, this is my best guess. Perhaps someone at Particle could explain this much better than I ever could.

Monday, November 28, 2016

Memory Management Fun

Flash?

There comes a time in every programmer's life when they ask themselves, "how much memory is my code using?" Now is that time.

Background: what are we doing?

Being a first aid kit addon, the logical addition to our product is a way for people to find out first aid information in order to help people in trouble. So, of course, we decided to go and make essentially a new operating system for the Photon to run off of. What could go wrong?

C++ is Fun (this is a lie)

Though it's quite widely known and I do have a bit of experience, C++ still posed quite a few challenges in programming. To put it bluntly, I'm still working on it. The inclusion of libraries and a lot of other factors have made this project quite a bit more complicated than I would have liked, but this is the life that I've chosen.

Crashing and Memory Management

The Photon and its firmware are both quite fickle, and the slightest perturbation has caused countless crashes in our testing. Code that works fine one moment crashes the next, sometimes without any changes, though rarely in such a case. Much more common is random crashing caused by removal of code that has nothing to do with the cause of the crash!

When I was first prototyping the code, I decided to include an abbreviation field to store a menu's abbreviation. But I never used this field, and so I believed that I could free up some critical space by getting rid of it. No dice, however, as the program crashed as soon as I pressed any buttons. No debug output could explain it, and at this point this memory-eating feature still exists.

Another fun bug came with changing the information that our device provided. I noticed that some characters were being cut off, so I decided to fix it in my code. And again, it just decided to crash randomly?

This issue is very confusing, and the only explanation I possibly have is that the method of allocating memory within the code is not allocating enough. After all, I am using an array of char pointer pointers as a 2 dimensional array, so I do not know the specifics of how the Photon manages its memory. Thus, attempting to read the memory at a certain point may exceed the bounds of the array and read other information that is complete gibberish.

But if this were the case, there still remains a puzzling bug. The location menu currently has no functionality built into its center button. In the final product, pressing the center button should call the Geolocation methods we worked so tediously on before and then display the results to the screen. But right now, we are still prototyping this "Operating System" so that functionality is not included. Despite this, pressing the button appears to be a toggle of some sort. Specifically, toggling whether some gibberish appears on the menu or not. We added tracing calls to see whether some other subclass's method was being called instead, but it was not. The code within the method itself is just the tracing output.

Color me surprised. These random crashes happen for no apparent reason, either. It's quite frustrating trying to develop in such an environment. Perhaps it would be better to try and develop something similar for a computer instead, just to try and flesh out the concept.

Thursday, November 10, 2016

Video Editing is Hard

In Our Best Interests

To not watch the proof of concept video. But if you insist, you can check it out here. Because I don't want to upload the video again.

Also it uses flash. Did we just get out of a time machine a decade in the past?

Broken Code and Broken Dreams

Push to Master

Don't do it if your code doesn't work. Please. And if someone opens an issue, please work on resolving it.

Sorry

I had to get that rant out of my system. I realize that many of these Particle library devs are volunteers who do what they do out of love. So I'd like to take a moment and thank all the people working on Open Source projects right now. Y'all the real MVPs.

Cutting it Close

Our proof of concept's requirements weren't really that stringent. That being said, I did want to add as much functionality as possible. In fact, I made an entire post about getting the Geolocation to work. What didn't make the cut into the proof of concept is somewhat more interesting.

Hardware that Failed

Our Nokia 5110 LCD came with a part that we had trouble identifying at first. Turns out, it's a Texas Instruments CD4050BE, a fancy part that made wiring much trickier. In fact, we weren't entirely sure what its purpose was. With a little digging, we found out that the LCD worked just fine without this part. And sure enough, it did. So now there's a CD4050BE just sitting around in our design lab. In case anyone needs it.

The other component we were playing around with was pulsesensor.com's (appropriately enough) pulse sensor. In our demo testing of the hardware, we were able to get it working just fine. But that was 2 weeks before our proof of concept demo. We were content with it working, and so decided to move onto the LCD and Geolocation, figuring that getting the pulse sensor working would be easy enough.

Turns out, it's not that simple. In the two weeks that had passed, the author of the code released an update. Appropriately enough, our code broke. It was time to go hunting for a reason. Fortunately, Particle's online IDE's libraries make use of Github. The repository was just a click away. And checking the changelogs and the Readme gave us an answer as to why it was broken. Previous versions had included a dependency library along with the library code. The latest version separated them, so that it was necessary to include both libraries in the code.

So we went back and did that and... nothing. Instead of complaining about being unable to find a necessary library, it was complaining about an error within the library itself. Quite puzzling indeed.

We still had a working demo with example code, though. So we booted it up, and sure enough it worked just fine. It was running version 1.5.0, while the newest version was 1.5.1. So instead of including 1.5.1 in our proof of concept code, we decided to go with 1.5.0. After all, the example code worked fine. Why wouldn't the rest of it work?

Well, for some bizarre and unknown reason, that code broke too. Same exact error as the 1.5.1 library, even though the example code worked fine. Our demo code with the 1.5.0 code still worked, and so did reusing the example code for 1.5.0. But when we tried to migrate the Pulse Sensor code to our proof of concept, it broke every single time.

At this point, we had already invested nearly an hour of time trying to get the pulse sensor to work. No matter what we tried, it simply refused to. So instead of wasting more time, we decided to cut it. For the proof of concept, we only really needed a location and a screen to display that location. So we decided to work on those for a change.

Hardware that didn't quite Fail

To be quite honest, hardware and software are both hard. And for us in the Hardware IoT section that involves programming our devices, getting both to work together is even more difficult. Case in point - another many years of time used getting the screen to work. Aside from cutting out that weird Texas Instruments part, we did have to do quite a bit of work to get it working. And the other hard part about it - if it's not working 100%, it won't do anything.

Actually, that was a lie. The backlight controls and the screen control are separate. Aside from that, however, we had zero feedback as to where our errors were.

And speaking of errors, there were quite a few of them getting the screen to work. Our Photon currently has a significant number of its pins in use just for the screen. Misplace any one of them, and the entire thing fails and nobody knows why. In fact, we broke it multiple times over the course of testing. Mostly, it was accidentally pulling out a wire. But sometimes it was trickier. We spent a good 20 minutes trying to get the screen working once, only to find out that somehow a wire had moved from A3 to A4. Resetting it quickly fixed the issue.

One Grand Realization

Experience really does matter in this line of work. There are so many different things that all have to be working in perfect harmony. Even though I have quite a bit of experience coming into this class, I've still been stuck on problems for many hours that turn out to have simple solutions. And it's not because I refuse to learn from my past mistakes. No, it's because there are so many possible mistakes to make that it becomes impossible to not make one eventually. Even the most experienced of engineers will end up hooking up a wire incorrectly. Some may spend hours in their current state trying to diagnose the issue. Others may decide to simply tear down their machine and start over.

Sometimes I feel like I should do the latter.

Friday, November 4, 2016

Occam was Right

Overengineering at its Finest

From my time browsing the internets, I've come to hear one piece of engineering advice over and over and over again.

Don't overengineer stuff.

Well, it's kind of too late for that.

Geolocation is hard

With the work on the rest of the proof of concept going smoothly and ahead of schedule, I decided to work on something that wasn't explicitly in the proof of concept, but would be helpful for a more complete project: Geolocation.

The idea is that Geolocation provides an easy way for first responders to locate the rescuer and victims, especially in less-than-ideal situations such as earthquakes with rubble everywhere. The final product relies on a GPS module, which is built into the Particle Electron, but not the Photon. In disaster situations, GPS should be much more reliable, and it also comes with Particle's built-in library, AssetTracker. But GPS modules cost $40, which we already spent on other components. That wasn't an option, nor was throwing down $70 to buy an Electron and Particle's data plan. Instead, I decided to rely on Wifi access points and Google's Geolocation API.

Why-fi? Get it?

Well, because it was basically the only option we had. We're broke college kids who can't afford GPS. So basically, Wifi is the only option we're left with. So that's what we're gonna go with, and hope that it works.

The Quest to Query Google

It's a hard day for two of us freshies who have no experience with web APIs at all. I had to obtain an API key from Google in order to use their services. The free key allows me to make up to 50 requests per second and 2500 total requests per day. I highly doubt I will reach this limit any time soon, though for a completed product it may be necessary to pay the $.50 for 1000 additional requests, up to 100,000 per day. Or maybe even upgrade to the premium plan. But, of course, it is highly doubtful that we will still be using Wifi access points to determine our location when GPS is available and much more accurate.

For the prototype, though, this is about as reliable as it gets. That presents another challenge: how to actually make a query to the service. Oh boy. Our inexperience really showed here. We ran through quite a few services to try and get this working, and ended up completely ditching all of them. Yep.

IFTTT

My first attempted solution was IFTTT - IF This Then That, a simple IoT "recipe" maker that allows certain action events to trigger certain responses, as is in the name. Particle provided a handy channel that allowed us to listen for changes in variables, function return values, events, and even the device status. I decided to start with an event listener. After all, it should be the least tedious to get working, right?

Well, there were a couple problems. The action trigger in IFTTT required that the contents were equal to some arbitrary parameter, so our solution would be impossible to implement via IFTTT's Particle Event listener. Perhaps, then, the variable solution would be better.

But by using the variable listener, the solution becomes even more convoluted. We can say that the variable value is not equal to something, but then, what is stopping IFTTT from using up all our daily allotment of API calls in the blink of an eye? If we make IFTTT respond by firing an event to set the value of the variable back to 0, we can theoretically fix it. But then there are even more things flying around. Not good.

Not to mention that querying Google's Geolocation API and returning the result was not really easy, either. IFTTT provides the Maker channel to make web API queries, which is exactly what we were after. But with our inexperience, it quickly became obvious that this was not the solution we were after. Sure, we could get the request. But how would we listen for the response? The Maker channel required an event name, but what would that be? We were absolutely clueless (and to be fair, I still am). So we decided that IFTTT was not going to work.

Noodl

In lab on the 28th of October, Simon introduced Noodl to us, a simple-to-use (compared to writing raw code) prototyping tool that had Javascript features built-in. So I decided to figure out how to get it to query Google's Geolocation API. But like anything, it wasn't exactly straightforward.

The example code included a few hundred lines of unformatted Javascript code that was required in order to interact with the Particle cloud. And once that was done, it required even more work to try and get an HTTP POST request to Google's API and process the response. Needless to say, it didn't quite work out as planned.

Sure, we ended up with a nice button that said "geolocate," but that button or the Javascript couldn't do everything necessary. Which was disappointing.

Thingspeak

It was at this point that I stumbled upon a post on the Particle Forums that did almost exactly what we needed. So I decided to look up what the basis of this post was. Turns out, it was Particle's own system to create webhooks. So I booted up the tutorials they provided and followed the steps they gave.

Those steps included setting something up with Thingspeak. So I did that and ended up relaying the information from the Photon to the Particle Cloud and then finally to Thingspeak. At Thingspeak, I was successful in creating a POST request to the Geolocation API. Almost there!

The only issue now was to try and process the response. I had to create a listener for the response and then create an action handler to forward that information back to the Particle cloud. The only problem with that was, well, all of it. I didn't know how to do that whatsoever. And despite me playing around in Lab for about 2 hours to try and fix it, I couldn't get it.

Give up?

Nope. If you find yourself in a pickle, back up a few steps to try and see why you're in the pickle in the first place. So I did just that, and I realized something. I didn't need a 3rd party cloud service at all. Particle's own Webhooks were powerful and simple enough to do just what I needed. Thus, I resolved to create a Webhook that would connect the Photon to Google and get its location.

Webhooks

Finally, the solution is known. No 3rd party cloud services necessary. A simple Webhook is enough to relay all the necessary information to determine a device's location. The only issue is that the Webhook creation process wasn't exactly straightforward. Once again, I was left fiddling with the system for a while before figuring out how it worked at all. And, to be fair, I still don't understand exactly how it works.

Creating a webhook can be done in two ways. With a proper JSON file, you can do it using Particle's Command Line Interface. The web interface also works and is more user-friendly, but at the same time leaves users with so many options that they should be overwhelmed. At least I was.

What's more, webhooks currently cannot be edited. If you mess up, you have to delete it and start over again. This is an issue.
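For context, the body that eventually has to reach Google is a JSON list of nearby access points. A rough sketch of building it is below; the field names wifiAccessPoints, macAddress, signalStrength and channel come from Google's Geolocation API, while AccessPoint and buildGeolocationBody are hypothetical names and not the firmware that actually ran on the Photon.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One scanned access point (hypothetical struct).
struct AccessPoint {
    std::string mac;  // e.g. "00:25:9c:cf:1c:ac"
    int rssi;         // signal strength in dBm
    int channel;      // Wifi channel number
};

// Build the JSON request body from a list of scanned access points.
std::string buildGeolocationBody(const std::vector<AccessPoint>& aps) {
    std::string body = "{\"wifiAccessPoints\":[";
    for (std::size_t i = 0; i < aps.size(); ++i) {
        assert(!aps[i].mac.empty());
        if (i > 0) body += ",";
        body += "{\"macAddress\":\"" + aps[i].mac + "\",";
        body += "\"signalStrength\":" + std::to_string(aps[i].rssi) + ",";
        body += "\"channel\":" + std::to_string(aps[i].channel) + "}";
    }
    body += "]}";
    return body;
}
```

On the device, the string would be published as an event for the webhook to forward. The sketch does no JSON escaping, which is fine for MAC addresses but would not be for arbitrary text.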

Moment of Truth

Would it even work, though? Well, as they say, there's only one way to find out.

I flashed the Photon with the firmware then went to the Particle App. Sure enough, the function scanWifiAPs showed up as a callable function.

I opened up the Serial monitor to read the data I would be receiving.

I called the function and waited.

The data in the format to send to Google showed up.

And nothing.

What have I done wrong?

Turns out that Google's Geolocation API is quite fickle. A quick check of the Logs showed the reason - Response code 400, which means that either
  • My API key is faulty
  • The request body is incorrect.
Except neither of those was the issue. I just called my method again, and it returned something!
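Since simply trying again worked, one could wrap the call in a small retry loop. This is a generic sketch, not what the firmware did: retryUntilOk is a hypothetical helper, and the request is modeled as any callable that returns an HTTP status code.

```cpp
#include <cassert>
#include <functional>

// Call `request` until it reports HTTP 200 or `maxAttempts` is used up.
// Returns the number of attempts used on success, or 0 if every attempt failed.
int retryUntilOk(const std::function<int()>& request, int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (request() == 200) {
            return attempt;
        }
    }
    return 0;
}
```

A small cap on attempts matters here, because every retry counts against the daily request quota. Retrying a 400 is normally pointless since it usually means the request itself is bad, so this only makes sense for the flaky behaviour seen here.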

Did I mess up?

I got a location at approximately 40N 80W. I plugged those numbers into Google Maps, and...

I'm in the middle of nowhere. The nearest city is Pittsburgh. Close, but not quite.

Call me insane, but I tried the exact same thing again.

Google Why

Nope, it works this time. I am now consistently placed within 40m of the middle of the Duderstadt Center, which is where I have been testing. I have no clue why any of those errors are happening. I blame Google.

Words Unspoken by any Engineer

"It works!"

Of course, all this work is only for a proof of concept at the moment. For a final product, the Particle Electron and its AssetTracker library provide easy wrappers around its built-in Adafruit GPS module, which can obtain a device's latitude and longitude with ease.