How can textscan in MATLAB read input in a format that contains spaces?

Posted by a user on 2022-07-07 02:33

2 answers

Answer from a community user, 2023-10-09 05:29

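Usage of textscan
Form 1: C = textscan(fid, 'format', N, 'param', value)
Form 2: C = textscan(str, 'format', N, 'param', value)
Note that these are two different cases: the first reads from a file through its identifier fid, the second reads from a string.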
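
The two forms can be compared side by side. The following is a minimal sketch, not part of the original answer: the temporary file name and the sample data '1 2 3' are assumptions chosen for illustration.

```matlab
% Form 2: parse a character vector directly.
str = '1 2 3';
C_str = textscan(str, '%f');

% Form 1: parse the same text from a file identifier.
% The temporary file is hypothetical and exists only for this sketch.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n');
fclose(fid);

fid = fopen(fname, 'r');
C_file = textscan(fid, '%f');
fclose(fid);
delete(fname);

% Both calls return a 1x1 cell holding the column vector [1; 2; 3].
sameResult = isequal(C_str{1}, C_file{1});
```

In both cases the result is a cell array with one cell per conversion specifier in the format.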

Start with the string case.
Example 1:
str = '0.41 8.24 3.57 6.24 9.27';
c = textscan(str,'%3.1f');
c{1,1}
ans =
0.4000
1.0000
8.2000
4.0000
3.5000
7.0000
6.2000
4.0000
9.2000
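7.0000
"%3.1f" means that each field is read with a width of 3 characters (digits plus the decimal point), with 1 digit after the decimal point. For 0.41 this yields 0.4, and the leftover digit 1 is read as a separate value, 1.0.
C = textscan(str, '%3.1f %*1d'); gives C{1} = [0.4; 8.2; 3.5; 6.2; 9.2] (the %*1d field reads one digit and discards it)
C = textscan(str, '%3.1f %*1u'); gives C{1} = [0.4; 8.2; 3.5; 6.2; 9.2]
C = textscan(str, '%3.1f'); gives C{1} = [0.4; 1.0; 8.2; 4.0; 3.5; 7.0; 6.2; 4.0; 9.2; 7.0]
C = textscan(str, '%2.1f %*1u'); gives C{1} = [0 1.0000 0.2000 3.0000 7.0000 0.2000 9.0000 7.0000]
C = textscan(str, '%2.1f %1u'); note that the result now contains two groups: C{1} = [0 1.0000 0.2000 3.0000 7.0000 0.2000 9.0000 7.0000]
C{2} = [4 8 4 5 6 4 2]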
Example 2: reading data of different types
Create a file 'scan1.dat' with the following contents:
09/12/2005 Level1 12.34 45 1.23e10 inf Nan Yes 5.1+3i
10/12/2005 Level2 23.54 60 9e19 -inf 0.001 No 2.2-.5i
11/12/2005 Level3 34.90 12 2e5 10 100 No 3.1+.1i

fid = fopen('scan1.dat');
C = textscan(fid, '%s %s %f32 %d8 %u %f %f %s %f');
fclose(fid);
Note: each conversion specifier in the format, such as "%s" or "%f32", adds one more cell to C.
C{1} = {'09/12/2005'; '10/12/2005'; '11/12/2005'} class cell
C{2} = {'Level1'; 'Level2'; 'Level3'} class cell
C{3} = [12.34; 23.54; 34.9] class single
C{4} = [45; 60; 12] class int8
C{5} = [4294967295; 4294967295; 200000] class uint32
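Note: the values 9e19 and 1.23e10 in the file are far outside the range of %u. %u reads an unsigned 32-bit integer whose maximum is 4294967295, so both values are returned as that maximum.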
C{6} = [Inf; -Inf; 10] class double
C{7} = [NaN; 0.001; 100] class double
C{8} = {'Yes'; 'No'; 'No'} class cell
C{9} = [5.1+3.0i; 2.2-0.5i; 3.1+0.1i] class double
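
The saturation described in the note on C{5} can be reproduced without a file. This sketch uses direct uint32 casts as a stand-in for the %u conversion, which is an assumption made for illustration; it shows the same clamping to the uint32 maximum.

```matlab
% Direct casts, used here only to illustrate integer saturation.
maxU32 = intmax('uint32');   % 4294967295
a = uint32(9e19);            % too large, clamps to maxU32
b = uint32(1.23e10);         % too large, clamps to maxU32
c = uint32(2e5);             % in range, stays 200000
disp([a b c])
```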
Example 3: assigning values to missing fields
Create a file data2.csv with the following contents:

abc, 2, NA, 3, 4
// Comment Here
def, na, 5, 6, 7

fid = fopen('data2.csv');
C = textscan(fid, '%s %n %n %n %n', 'delimiter', ',', ...
'treatAsEmpty', {'NA', 'na'}, ...
'commentStyle', '//');
fclose(fid);
The format contains 5 conversion specifiers such as "%s", so C has 5 cells:
C{1} = {'abc'; 'def'}
C{2} = [2; NaN]
C{3} = [NaN; 5]
C{4} = [3; 6]
C{5} = [4; 7]

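Note: when textscan reads from a fid, that is, from a file, each call advances the file position, so the next textscan call starts where the previous one stopped. When textscan reads from a string, every call starts from the first character. To continue from somewhere other than the beginning of the string, request a second output, which returns the position where scanning stopped.
Example 4:
lyric = 'Blackbird singing in the dead of night'
[firstword, pos] = textscan(lyric,'%9c', 1);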
lastpart = textscan(lyric(pos+1:end), '%s');
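
The difference between file input and string input described in the note above can be sketched as follows. The temporary file and the data '1 2 3 4 5' are assumptions made for this illustration.

```matlab
% Hypothetical temporary file holding five numbers.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3 4 5\n');
fclose(fid);

% File input: the second call continues where the first one stopped.
fid = fopen(fname, 'r');
first = textscan(fid, '%f', 2);   % reads 1 and 2
rest  = textscan(fid, '%f');      % reads the remaining 3, 4 and 5
fclose(fid);
delete(fname);

% String input: every call starts again at the first character.
str = '1 2 3 4 5';
s1 = textscan(str, '%f', 2);      % reads 1 and 2
s2 = textscan(str, '%f', 2);      % reads 1 and 2 again
```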

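Note the difference between the following two calls:
lyric = 'Blackbird singing in the dead of night'
[firstword, pos] = textscan(lyric,'%9c', 2);
firstword{1} is reported as "Blackbird" and "singing i"
[firstword, pos] = textscan(lyric,'%8c', 2);
firstword{1} is reported as "Blackbir" and "d singing"
With "%9c", the first 9 characters end exactly at a space, so the next 9 characters are read starting from the s. With "%8c", reading resumes directly at the d, and the space after it is read as an ordinary character.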

lastpart = textscan(lyric(pos+1:end), '%s');

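Note: suppose the file data.txt contains the following data (the 'HeaderLines', 1 option used below skips the first line of the file, so the result shown assumes that a one-line header precedes these two rows):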
1,1,null,2,2
1,2,2,null,2
It is read as follows:
fid = fopen('data.txt','r');
C = textscan(fid, '%n','delimiter',',','treatAsEmpty','null','HeaderLines',1);
fclose( fid ); clear fid ans
The result is:
C{1}(1:10) = [1 1 NaN 2 2 1 2 2 NaN 2]
However, if "%n" in the same call is changed to "%u" or "%d", for example:
C = textscan(fid, '%d','delimiter',',','treatAsEmpty','null','HeaderLines',1);
then the NaN entries in the result above become 0.

Example 5: using 'CollectOutput'
Using a text editor, create a file grades.txt that contains
Student_ID | Test1 | Test2 | Test3
1 91.5 89.2 77.3
2 88.0 67.8 91.0
3 76.3 78.1 92.5
4 96.4 81.2 84.6

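fid = fopen('grades.txt');
% Read the 4 column headers, which are separated by '|'
C_text = textscan(fid, '%s', 4, 'delimiter', '|');

% Read the numeric block that follows the header line
C_data1 = textscan(fid, '%d %f %f %f', ...
'CollectOutput', 1)
fclose(fid);
Note: with 'CollectOutput', consecutive outputs of the same class are merged into one array. Here the %d column is returned on its own, and the three %f columns are returned together in a single matrix.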
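
The effect of 'CollectOutput' can also be checked on a string. This is a minimal sketch under the assumption that the first two data rows of grades.txt are passed as text instead of being read from the file.

```matlab
% Two data rows taken from grades.txt, supplied as a character vector.
txt = sprintf('1 91.5 89.2 77.3\n2 88.0 67.8 91.0');

% Without CollectOutput: one cell per conversion specifier (1x4 cell).
C_sep = textscan(txt, '%d %f %f %f');

% With CollectOutput: the three adjacent %f columns share one matrix.
C_col = textscan(txt, '%d %f %f %f', 'CollectOutput', 1);

nSep = numel(C_sep);       % 4
nCol = numel(C_col);       % 2
idClass = class(C_col{1}); % 'int32'
scoreSize = size(C_col{2}); % [2 3]
```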

Answer from a community user, 2023-10-09 05:29

I still don't fully understand what you mean. My rough understanding is that you want something like this:

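%% Scan string 2
clear
clc
str = '1985 112 -10.53';
% Replace each space (ASCII 32) with the character '0' (ASCII 48)
A = find(str == 32);
str(A) = 48;
% The next line is equivalent to evaluating +198501120-10.53,
% not the +19850112-010.53 that you gave.
% The second space is in front of the minus sign; why is your first
% replacement in the right place while the second has moved back by one?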
str2num(str)
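
To make the effect of the snippet above concrete, here is a minimal sketch. It uses strrep as an assumed shorthand for the find-and-assign steps, and shows that str2num evaluates the zero-filled text as an arithmetic expression (a subtraction) rather than as two separate numbers.

```matlab
str = '1985 112 -10.53';

% strrep stands in for the find/assign steps of the answer.
filled = strrep(str, ' ', '0');   % '198501120-10.53'

% str2num evaluates the text, so this computes 198501120 - 10.53.
val = str2num(filled);
```

This is why the approach cannot return -010.53 as its own field: the minus sign is treated as an operator between the two parts.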

Here is another example of scanning a string with textscan; see whether it helps:

%% Use textscan to scan the data in a string
clc
str_1 = 'The number is 1 2 3 4 5';
% First use textscan to get the first 14 characters
[str1,position1] = textscan(str_1,'%14c',1);
str1{:}; % The number is
position1; % 14
% Get the length of the string
[temp1,temp2] = size(str_1);
% Then read the numeric part that follows
str_2 = textscan(str_1(position1+1:temp2),'%9c',1);
% Convert the string to numeric values
num = str2num(str_2{1})

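Follow-up from the asker: Thank you for the reply. I had considered the first approach as well, but as you pointed out, it cannot automatically recognize the value as -010.53 (the data I need to read is in exactly that format). I want to use textscan because I value its speed, so for now I have not looked at approaches that need more rounds of replacement. Inspired by your answer, though, I plan to try reading everything as strings first and then doing the replacement, and to test how fast that is.
Thanks again.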