5/02/2016

A scientific approach for Mt. Whitney hiking (4) ~ Custom Program for the data analysis

Above POST CAMP


Detailed data analysis
When I looked at my past tracking data from Mt. Whitney hikes, I could see signs that my moving pace slows down above about 10,000 feet (or possibly lower). Even though I did not feel any symptoms at the time, the slower pace is a clear indication of a response to high altitude. While the moving speed drops, the heart rate can stay in a similar range. These trends can be read from the overall data on the web site that visualizes the tracking records.
However, the service from the GPS device vendor is not really designed for analyzing the data, so it is hard to dig deeper or to compare one hike with another. So I decided to write a simple program (script) to extract the raw data from the tracking records. I think it will help with further analysis.

Here is an example from a Mt. Whitney hike on August 14, 2015.

Plot from Garmin Connect
The above is the original data on Garmin Connect. It shows the heart rate and the pace, with distance on the x-axis. From this plot it is not very clear that the moving speed (pace) is slowing down. The heart rate rises and then stays over 150 bpm.
We can reorganize the same data in a different way. For example, calculate the average speed and the average heart rate for every 50 m of elevation, then plot those averages against altitude.
Moving speed Plot based on altitude
Heart Rate Plot based on altitude


I think this gives a better way to understand the pace and the heart rate during the hike.
To do this, I wrote a program to reorganize the data for this kind of analysis.
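As a minimal sketch of the 50 m averaging described above (a simplified illustration, not the actual program), the records can be grouped into elevation bins and averaged per bin. The function name average_by_elevation, the tuple layout and the units (meters, km/h, bpm) are assumptions made for this example.

```python
from collections import defaultdict


def average_by_elevation(records, bin_size_m=50.0):
    """Average speed and heart rate per elevation bin.

    records: iterable of (elevation_m, speed_kmh, heart_rate_bpm) tuples.
    This layout is an assumption for the sketch, not the real file format.
    Returns a list of (bin_floor_m, avg_speed, avg_heart_rate), sorted by bin.
    """
    bins = defaultdict(list)
    for elevation, speed, heart_rate in records:
        # Lower edge of the bin this record falls into, e.g. 3010 m -> 3000 m.
        bin_floor = int(elevation // bin_size_m) * bin_size_m
        bins[bin_floor].append((speed, heart_rate))

    result = []
    for bin_floor in sorted(bins):
        samples = bins[bin_floor]
        avg_speed = sum(s for s, _ in samples) / len(samples)
        avg_hr = sum(h for _, h in samples) / len(samples)
        result.append((bin_floor, avg_speed, avg_hr))
    return result
```

Each output row can then be plotted with the bin elevation on the x-axis, which is the layout of the two altitude-based plots above.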



Custom data extraction program (script)
There are a few things the program can do with the tracking records.
First, my GPS device (at least) logs a record every few seconds. That may be good for tracking running or biking, but the interval is probably too short for hiking, because the moving speed is much slower. Therefore, I think it is better to reorganize the data before plotting a hike. We can define a duration for each sample, and the program calculates the average over the records that fall within that duration. GPS data includes some error, so if we plot every single record, some points will be wrong. Some records are obviously outside the range of the others and are probably error records. Averaging helps to minimize their impact.
Second, it is probably a good idea to define “resting”. Resting is not a big factor for running, but for hiking it is something people always do, and it is part of their plan, so it is useful to see it in the analysis. In terms of the implementation, the program can define a minimum moving speed. If the moving speed is below that threshold, the record is considered “resting”. Then we can calculate the moving time and the resting time in the analysis.
Last, there are key location points. They are not really important for running, but for hiking some key points help to compare and analyze the data. We can define key locations along the way, such as a major peak, a junction with another trail, a major campsite, and so on. If we can mark such locations in the record, it is easier to read the data. In the implementation, the program imports a list of key locations, each with a label (name), a latitude and a longitude. Then we define the maximum distance error within which a record is considered close enough to a location. The latitude and the longitude of a location can be read from a map.
Since I only use GPS devices from Garmin, the source data can be downloaded from the Garmin Connect web site, which supports the GPX and TCX file formats in addition to the original tracking data from the device. The original data is in a binary format, the “.fit” file, whose specification can be found on the internet. However, GPX and TCX are much easier to handle, so the program takes the GPX and TCX files from the site. GPX is a standard format for GPS tracking data, but there seem to be some extensions and version differences. For now, the program only supports files from the Garmin Connect web site.
For readers of my blog, I will share the Python source code under the GPL license. You can download it from my BOX.com folder and try it.
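A minimal sketch of this resampling, assuming each record is a pair of elapsed seconds and speed, might look like the following. The function name average_by_window and the 60-second default are assumptions for illustration only; the actual program may use a different duration and handle more fields.

```python
def average_by_window(records, window_s=60.0):
    """Average the speed over fixed time windows.

    records: list of (elapsed_s, speed_kmh) tuples sorted by time
    (an assumed layout for this sketch).
    Returns a list of (window_start_s, avg_speed_kmh).
    """
    if not records:
        return []

    t0 = records[0][0]
    sums = {}  # window index -> (sum of speeds, number of records)
    for elapsed, speed in records:
        index = int((elapsed - t0) // window_s)
        total, count = sums.get(index, (0.0, 0))
        sums[index] = (total + speed, count + 1)

    # One averaged sample per window that contains at least one record.
    return [
        (t0 + index * window_s, total / count)
        for index, (total, count) in sorted(sums.items())
    ]
```

A plain mean softens an outlier record but does not remove it; a longer window or a median would be an alternative if error records still stand out.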
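The moving / resting split can be sketched as below. This is only an illustration: the function name split_moving_resting, the record layout and the 1.0 km/h threshold are assumptions, and each interval is classified by the speed logged at its end.

```python
def split_moving_resting(records, min_moving_speed_kmh=1.0):
    """Return (moving_seconds, resting_seconds).

    records: list of (elapsed_s, speed_kmh) tuples sorted by time
    (an assumed layout for this sketch).
    An interval counts as resting when the speed at its end is below
    the minimum moving speed.
    """
    moving = 0.0
    resting = 0.0
    for (t_prev, _), (t_curr, speed) in zip(records, records[1:]):
        dt = t_curr - t_prev
        if speed < min_moving_speed_kmh:
            resting += dt
        else:
            moving += dt
    return moving, resting
```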
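One way to decide whether a record is close enough to a key location is the great-circle (haversine) distance, sketched below. The source does not say which distance formula the program uses, so this is an assumption, as are the function names, the 50 m default and the placeholder coordinates.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_key_location(lat, lon, key_locations, max_error_m=50.0):
    """Return the label of the closest key location within max_error_m, else None.

    key_locations: list of (label, latitude, longitude) tuples.
    """
    best_label, best_dist = None, max_error_m
    for label, key_lat, key_lon in key_locations:
        dist = haversine_m(lat, lon, key_lat, key_lon)
        if dist <= best_dist:
            best_label, best_dist = label, dist
    return best_label


# Placeholder entry for illustration; real coordinates should be read from a map.
KEY_LOCATIONS = [("Key point A", 36.5786, -118.2920)]
```

For example, nearest_key_location(36.5786, -118.2920, KEY_LOCATIONS) returns "Key point A", while a point about a kilometer away returns None.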
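As a rough sketch of reading track points from a GPX file with the Python standard library, the example below matches elements by local name so that namespace and version differences matter less. It is not the actual program. The sample data is made up, and the assumption is that the heart rate sits in an hr element inside the Garmin track point extension.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the general shape of a GPX 1.1 file with a Garmin extension.
SAMPLE_GPX = """<gpx xmlns="http://www.topografix.com/GPX/1/1"
     xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1"
     version="1.1" creator="example">
  <trk><trkseg>
    <trkpt lat="36.5000" lon="-118.2000">
      <ele>3000.0</ele>
      <time>2015-08-14T12:00:00Z</time>
      <extensions>
        <gpxtpx:TrackPointExtension>
          <gpxtpx:hr>150</gpxtpx:hr>
        </gpxtpx:TrackPointExtension>
      </extensions>
    </trkpt>
  </trkseg></trk>
</gpx>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_trackpoints(gpx_text):
    """Return a list of dicts with lat, lon, ele, time and hr for each track point."""
    root = ET.fromstring(gpx_text)
    points = []
    for element in root.iter():
        if local_name(element.tag) != "trkpt":
            continue
        point = {
            "lat": float(element.get("lat")),
            "lon": float(element.get("lon")),
            "ele": None,
            "time": None,
            "hr": None,
        }
        # Walk all descendants so fields nested in <extensions> are found too.
        for child in element.iter():
            name = local_name(child.tag)
            if name == "ele":
                point["ele"] = float(child.text)
            elif name == "time":
                point["time"] = child.text
            elif name == "hr":
                point["hr"] = int(child.text)
        points.append(point)
    return points
```

Calling read_trackpoints(SAMPLE_GPX) gives one point with an elevation of 3000.0 and a heart rate of 150. Fields that are missing from a record stay None.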



(To be continued)
