Matlab - Parsing Data Files With Header Content
25 Oct 2007 Daniel Sutoyo
Part 1: Preparing Data Set to be Read Into Matlab
Often in Matlab we would like to read data from a *.txt file. The importdata function is convenient as long as you don’t have text or an inconsistent number of columns in your data set. However, if you’re dealing with large volumes of data, it is inconvenient to delete the header content by hand. One option is to use a script (in PHP, for example) to strip the header from all your files automatically. But most data sets generated by data acquisition systems include header content that provides key information on measurement parameters, such as the number of points, the sampling rate, etc. Thus, we need to be able to extract this key information as well! In this tutorial I will show you how you can easily read in data files with header content.
Suppose your data file looks something along the lines of
n := 9
d := 5
k := 4
param p:1 2 3 4 5 :=
1 0.4 0.5 0.6 0.12 0.6
2 0.4 0.5 0.6 0.12 0.6
3 0.1 0.2 0.4 0.22 0.3
4 0.2 0.3 0.3 0.12 0.6
5 0.3 0.4 0.2 0.42 0.4
6 0.2 0.5 0.6 0.12 0.6
... data set continues
n = number of data points, d = number of data columns, and k = whatever other parameters you might need. You can read in the data using importdata on the input file as it is, but be warned that Matlab will not store the data as you would expect: it tries to treat the whole file as one matrix, header text included! The way to get around this is as follows:
% Read in File
id = fopen('filename.txt');

% Read in Header Content (the three "name := value" lines)
for i=1:3
    readin = fgetl(id);
    para(i) = str2num(readin(6:length(readin)));  % value starts at character 6
end
fgetl(id);  % skip the "param p:..." column-label line

% Read in Data (para(1) is n, the number of data rows)
for i=1:para(1)
    data(i,:) = str2num(fgetl(id));
end
fclose(id);
Don’t worry, I’ll explain how the code works now.
id = fopen('filename.txt');
The function fopen creates a file identifier (a numerical name tag) that Matlab uses for its file functions. It essentially “opens” the file within Matlab, allowing Matlab to start reading in the file. The file is read in line by line using the function fgetl. You can remember it as “get line”. To use the fgetl function, just call it on the file identifier of the input file. In this case, the file identifier is id. Each time you call fgetl, Matlab automatically moves on to the next line in the file, so that you don’t keep reading in the same line!
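Here is a minimal sketch of that open-and-read-line behaviour, assuming a small sample file modeled on the example above. The names sampleFile, first, second, and parts are made up for illustration, and splitting on ':=' with strsplit is a swapped-in alternative to the fixed character position used in the listing, not the method from the post.

```matlab
% Write a small sample file so the sketch is self-contained.
% (sampleFile and its contents are illustrative only.)
sampleFile = [tempname '.txt'];
out = fopen(sampleFile, 'w');
fprintf(out, 'n := 2\nd := 5\nk := 4\nparam p:1 2 3 4 5 :=\n');
fprintf(out, '1 0.4 0.5 0.6 0.12 0.6\n2 0.4 0.5 0.6 0.12 0.6\n');
fclose(out);

% fopen returns -1 when the file cannot be opened, so check before reading.
id = fopen(sampleFile);
if id == -1
    error('Could not open %s', sampleFile);
end

% Each fgetl call returns the current line (without the newline)
% and moves on to the next one.
first  = fgetl(id);   % first header line
second = fgetl(id);   % second header line

% Take whatever follows ':=' instead of relying on a fixed character position.
parts = strsplit(first, ':=');
n = str2double(strtrim(parts{2}));

fclose(id);
delete(sampleFile);
```

Splitting on ':=' keeps working if a parameter name is longer than one character, which the fixed index 6 would not handle.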
7 Responses to “Matlab - Parsing Data Files With Header Content”
Also, if you don’t know how many rows your numerical data set consists of, you can use a while loop instead of a for loop.
r = 1;
while 1
    readin = fgetl(id);
    if ~ischar(readin), break; end  % fgetl returns -1 at end of file
    data(r,:) = str2num(readin);
    r = r+1;
end
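When the row count is unknown, another option is to let textscan read to the end of the file in one call. This is only a sketch of an alternative, not the approach used in the post. It assumes four header lines and six numeric columns, as in the sample file, and sampleFile and cols are illustrative names.

```matlab
% Build a small sample file (illustrative contents).
sampleFile = [tempname '.txt'];
out = fopen(sampleFile, 'w');
fprintf(out, 'n := 3\nd := 5\nk := 4\nparam p:1 2 3 4 5 :=\n');
fprintf(out, '1 0.4 0.5 0.6 0.12 0.6\n');
fprintf(out, '2 0.4 0.5 0.6 0.12 0.6\n');
fprintf(out, '3 0.1 0.2 0.4 0.22 0.3\n');
fclose(out);

id = fopen(sampleFile);
% Skip the four header lines, then read six numeric columns until the end.
cols = textscan(id, '%f %f %f %f %f %f', 'HeaderLines', 4);
fclose(id);
delete(sampleFile);

% textscan returns one cell per column; join them into a matrix
% with one row per data line.
data = cell2mat(cols);
```

The trade-off is that the header values are skipped rather than parsed, so this fits best when you only need the numbers.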
I can only say
You Made My day
thanx a lot
Hi,
your matlab help is super useful for me. thank you.
Q: i have a file that has a row of text every 1000 lines of numeric data. looks like that:
text
int1
int2
…
int1000
text
int1
int2
…
int1000
text
etc etc.
is there a way to read the last line of text (it has useful information) and get the numeric data? well, i’m sure there is a way, i’m wondering if you know how to do it.
thank you
for the previous post-
i forgot to mention a few things.
1. the last line of text is not at the very end of the file
2. the original file is a .txt file
Misha,
the examples in the tutorial should lay out the framework you need… Since you already know there are 1000 numeric lines, this makes it straightforward
nblocks = number of blocks, where each block is the text plus 1000 numeric lines (if you don’t know this, change the outer loop to a while loop and stop it when fgetl has no more lines to read)
for j=1:nblocks
1. call fgetl to read in the text (I don’t know how many lines of text you have or what they look like)
2. Then call fgetl 1000 times to read in the numeric data
end
You can switch the order around if necessary. If you want your code to be more dynamic, you can always add ‘if’ statements to check whether each line is a text header or numeric data.
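The outline above could be sketched roughly as follows. It assumes exactly one text line per block and complete blocks, and it uses a block size of 3 instead of 1000 so the sample stays short. The names blockSize, values, lastHeader, and the file contents are all made up for illustration.

```matlab
% Build a tiny file: one text line followed by blockSize numeric lines, repeated.
blockSize = 3;        % would be 1000 for the file described in the question
sampleFile = [tempname '.txt'];
out = fopen(sampleFile, 'w');
fprintf(out, 'run A\n1\n2\n3\nrun B\n4\n5\n6\n');
fclose(out);

fid = fopen(sampleFile);
values = [];          % all numeric data, stacked in one column
lastHeader = '';      % most recent text line seen

while true
    header = fgetl(fid);            % text line that starts a block
    if ~ischar(header), break; end  % fgetl returns -1 at end of file
    lastHeader = header;
    block = zeros(blockSize, 1);
    for k = 1:blockSize
        block(k) = str2double(fgetl(fid));  % one number per line
    end
    values = [values; block];       %#ok<AGROW> acceptable for a sketch
end

fclose(fid);
delete(sampleFile);
```

After the loop, lastHeader holds the final text line even though it is not the last line of the file, and values holds every number in file order.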
hi
thanks for the previous postings…very helpful.
however, i have some data in a .txt file. i don’t know where the data rows finish (i.e. i don’t know which row is the last row!) and i have some lines of text in between every (for instance) 10 or 20 rows of data. could you please help me with that?
cheerZ
behzad
behzad–
I’m doing something very similar, where the # of lines of numbers between headers is variable. What I do is use str2double to check if the current line is data or a string. I also put in state variables to keep track of where to put the data, ignore whitespace and empty lines, etc., but this is the core of it.
while ~feof(fid) % go until EOF
    line = fgetl(fid);
    if isnan(str2double(line)) % it’s a text header
        disp(line); % do something with it
    else % else, it’s data
        disp(str2double(line)); % do something
    end
end
Or if you need to input matrices, not just doubles and ints, use str2num:
while ~feof(fid) % go until EOF
    line = fgetl(fid);
    [x, status] = str2num(line);
    if ~status % it’s the text header
        disp(line); % do something with it
    else % else, it’s data
        disp(x); % do something
    end
end
Cheers!
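The state tracking mentioned in the comment above is not shown there, so here is one possible sketch of it, under a few assumptions: every row within a block has the same number of values, blank lines carry no meaning, and header lines are not valid Matlab expressions. That last point matters because str2num evaluates the text, so a header that happens to be valid Matlab could be mistaken for data. The names headers, blocks, txt, and the sample contents are hypothetical.

```matlab
% Build a sample file with headers, a blank line, and blocks of varying length.
sampleFile = [tempname '.txt'];
out = fopen(sampleFile, 'w');
fprintf(out, '# block A\n1 2 3\n4 5 6\n\n# block B\n7 8 9\n');
fclose(out);

fid = fopen(sampleFile);
headers = {};   % header text, one cell per block
blocks  = {};   % numeric rows that follow each header

while true
    txt = fgetl(fid);
    if ~ischar(txt), break; end              % end of file
    if isempty(strtrim(txt)), continue; end  % ignore blank lines
    [row, ok] = str2num(txt);                %#ok<ST2NM> rows hold several numbers
    if ~ok
        headers{end+1} = txt;                % text header: start a new block
        blocks{end+1}  = [];
    else
        if isempty(blocks)                   % data before any header
            headers{end+1} = '';
            blocks{end+1}  = [];
        end
        blocks{end} = [blocks{end}; row];    % append row to the current block
    end
end

fclose(fid);
delete(sampleFile);
```

Each entry of blocks lines up with the entry of headers at the same index, so the number of data rows per header can vary freely.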