ROS语音控制机器人教程.docx

资源描述

ROS语音控制机器人教程.docx

《ROS语音控制机器人教程.docx》由会员分享，可在线阅读，更多相关《ROS语音控制机器人教程.docx（16页珍藏版）》请在冰豆网上搜索。

ROS语音控制机器人教程.docx

ROS语音控制机器人教程

如今语音识别在PC机和智能手机上炒的火热，ROS走在技术的最前沿当然也不会错过这么帅的技术。

ROS中使用了CMUSphinx和Festival开源项目中的代码，发布了独立的语音识别包，而且可以将识别出来的语音转换成文字，然后让机器人智能处理后说话。

一、语音识别包

1、安装

安装很简单，直接使用ubuntu命令即可，首先安装依赖库：

$sudoapt-getinstallgstreamer0.10-pocketsphinx

$sudoapt-getinstallros-fuerte-audio-common

$sudoapt-getinstalllibasound2

然后来安装ROS包：

其中的核心文件就是nodes文件夹下的recognizer.py文件了。

这个文件通过麦克风收集语音信息，然后调用语音识别库进行识别生成文本信息，通过/recognizer/output消息发布，其他节点就可以订阅该消息然后进行相应的处理了。

2、测试

安装完成后我们就可以运行测试了。

首先，插入你的麦克风设备，然后在系统设置里测试麦克风是否有语音输入。

然后，运行包中的测试程序：

$roslaunchpocketsphinxrobocup.launch

此时，在终端中会看到一大段的信息。

尝试说一些简单的语句，当然，必须是英语，例如：

bringmetheglass，comewithme，看看能不能识别出来。

我们也可以直接看ROS最后发布的结果消息：

$rostopicecho/recognizer/output

二、语音库

1、查看语音库

这个语音识别时一种离线识别的方法，将一些常用的词汇放到一个文件中，作为识别的文本库，然后分段识别语音信号，最后在库中搜索对应的文本信息。

如果想看语音识别库中有哪些文本信息，可以通过下面的指令进行查询：

$roscdpocketsphinx/demo

$morerobocup.corpus

2、添加语音库

我们可以自己向语音库中添加其他的文本识别信息，《rosbyexample》自带的例程中是带有语音识别的例程的，而且有添加语音库的例子。

首先看看例子中要添加的文本信息：

$roscdrbx1_speech/config

$morenav_commands.txt

这就是需要添加的文本，我们也可以修改其中的某些文本，改成自己需要的。

然后我们要把这个文件在线生成语音信息和库文件，然后在线编译生成库文件。

把下载的文件都解压放在rbx1_speech包的config文件夹下。

我们可以给这些文件改个名字：

$roscdrbx1_speech/config

$rename-f's/3026/nav_commands/'*

在rbx1_speech/launch文件夹下看看voice_nav_commands.launch这个文件：

output="screen">

可以看到，这个launch文件在运行recognizer.py节点的时候使用了我们生成的语音识别库和文件参数，这样就可以实用我们自己的语音库来进行语音识别了。

通过之前的命令来测试一下效果如何吧：

$roslaunchrbx1_speechvoice_nav_commands.launch

$rostopicecho/recognizer/output

三、语音控制

有了语音识别，我们就可以来做很多犀利的应用了，首先先来尝试一下用语音来控制机器人动作。

1、机器人控制节点

前面说到的recognizer.py会将最后识别的文本信息通过消息发布，那么我们来编写一个机器人控制节点接收这个消息，进行相应的控制即可。

在pocketsphinx包中本身有一个语音控制发布Twist消息的例程voice_cmd_vel.py，rbx1_speech包对其进行了一些简化修改，在nodes文件夹里可以查看voice_nav.py文件：

/usr/bin/envpython

importroslib;roslib.load_manifest（'rbx1_speech'）

importrospy

fromgeometry_msgs.msgimportTwist

fromstd_msgs.msgimportString

frommathimportcopysign

classVoiceNav:

def__init__（self）:

rospy.init_node（'voice_nav'）

rospy.on_shutdown（self.cleanup）

#Setanumberofparametersaffectingtherobot'sspeed

self.max_speed=rospy.get_param（"~max_speed",0.4）

self.max_angular_speed=rospy.get_param（"~max_angular_speed",1.5）

self.speed=rospy.get_param（"~start_speed",0.1）

self.angular_speed=rospy.get_param（"~start_angular_speed",0.5）

self.linear_increment=rospy.get_param（"~linear_increment",0.05）

self.angular_increment=rospy.get_param（"~angular_increment",0.4）

#Wedon'thavetorunthescriptveryfast

self.rate=rospy.get_param（"~rate",5）

r=rospy.Rate（self.rate）

#Aflagtodeterminewhetherornotvoicecontrolispaused

self.paused=False

#InitializetheTwistmessagewewillpublish.

self.cmd_vel=Twist（）

#PublishtheTwistmessagetothecmd_veltopic

self.cmd_vel_pub=rospy.Publisher（'cmd_vel',Twist）

#Subscribetothe/recognizer/outputtopictoreceivevoicecommands.

rospy.Subscriber（'/recognizer/output',String,self.speech_callback）

#Amappingfromkeywordsorphrasestocommands

self.keywords_to_command={'stop':

['stop','halt','abort','kill','panic','off','freeze','shutdown','turnoff','help','helpme'],

'slower':

['slowdown','slower'],

'faster':

['speedup','faster'],

'forward':

['forward','ahead','straight'],

'backward':

['back','backward','backup'],

'rotateleft':

['rotateleft'],

'rotateright':

['rotateright'],

'turnleft':

['turnleft'],

'turnright':

['turnright'],

'quarter':

['quarterspeed'],

'half':

['halfspeed'],

'full':

['fullspeed'],

'pause':

['pausespeech'],

'continue':

['continuespeech']}

rospy.loginfo（"Readytoreceivevoicecommands"）

#Wehavetokeeppublishingthecmd_velmessageifwewanttherobottokeepmoving.

whilenotrospy.is_shutdown（）:

self.cmd_vel_pub.publish（self.cmd_vel）

r.sleep（）

defget_command（self,data）:

#Attempttomatchtherecognizedwordorphrasetothe

#keywords_to_commanddictionaryandreturntheappropriate

#command

for（command,keywords）inself.keywords_to_command.iteritems（）:

forwordinkeywords:

ifdata.find（word）>-1:

returncommand

defspeech_callback（self,msg）:

#Getthemotioncommandfromtherecognizedphrase

command=self.get_command（msg.data）

#Logthecommandtothescreen

rospy.loginfo（"Command:

"+str（command））

#Iftheuserhasaskedtopause/continuevoicecontrol,

#settheflagaccordingly

ifcommand=='pause':

self.paused=True

elifcommand=='continue':

self.paused=False

#Ifvoicecontrolispaused,simplyreturnwithout

#performinganyaction

ifself.paused:

return

#Thelistofif-thenstatementsshouldbefairly

#self-explanatory

ifcommand=='forward':

self.cmd_vel.linear.x=self.speed

self.cmd_vel.angular.z=0

elifcommand=='rotateleft':

self.cmd_vel.linear.x=0

self.cmd_vel.angular.z=self.angular_speed

elifcommand=='rotateright':

self.cmd_vel.linear.x=0

self.cmd_vel.angular.z=-self.angular_speed

elifcommand=='turnleft':

ifself.cmd_vel.linear.x!

=0:

self.cmd_vel.angular.z+=self.angular_increment

else:

self.cmd_vel.angular.z=self.angular_speed

elifcommand=='turnright':

ifself.cmd_vel.linear.x!

=0:

self.cmd_vel.angular.z-=self.angular_increment

else:

self.cmd_vel.angular.z=-self.angular_speed

elifcommand=='backward':

self.cmd_vel.linear.x=-self.speed

self.cmd_vel.angular.z=0

elifcommand=='stop':

#Stoptherobot!

PublishaTwistmessageconsistingofallzeros.

self.cmd_vel=Twist（）

elifcommand=='faster':

self.speed+=self.linear_increment

self.angular_speed+=self.angular_increment

ifself.cmd_vel.linear.x!

=0:

self.cmd_vel.linear.x+=copysign（self.linear_increment,self.cmd_vel.linear.x）

ifself.cmd_vel.angular.z!

=0:

self.cmd_vel.angular.z+=copysign（self.angular_increment,self.cmd_vel.angular.z）

elifcommand=='slower':

self.speed-=self.linear_increment

self.angular_speed-=self.angular_increment

ifself.cmd_vel.linear.x!

=0:

self.cmd_vel.linear.x-=copysign（self.linear_increment,self.cmd_vel.linear.x）

ifself.cmd_vel.angular.z!

=0:

self.cmd_vel.angular.z-=copysign（self.angular_increment,self.cmd_vel.angular.z）

elifcommandin['quarter','half','full']:

ifcommand=='quarter':

self.speed=copysign（self.max_speed/4,self.speed）

elifcommand=='half':

self.speed=copysign（self.max_speed/2,self.speed）

elifcommand=='full':

self.speed=copysign（self.max_speed,self.speed）

ifself.cmd_vel.linear.x!

=0:

self.cmd_vel.linear.x=copysign（self.speed,self.cmd_vel.linear.x）

ifself.cmd_vel.angular.z!

=0:

self.cmd_vel.angular.z=copysign（self.angular_speed,self.cmd_vel.angular.z）

else:

return

self.cmd_vel.linear.x=min（self.max_speed,max（-self.max_speed,self.cmd_vel.linear.x））

self.cmd_vel.angular.z=min（self.max_angular_speed,max（-self.max_angular_speed,self.cmd_vel.angular.z））

defcleanup（self）:

#Whenshuttingdownbesuretostoptherobot!

twist=Twist（）

self.cmd_vel_pub.publish（twist）

rospy.sleep

（1）

if__name__=="__main__":

try:

VoiceNav（）

rospy.spin（）

exceptrospy.ROSInterruptException:

rospy.loginfo（"Voicenavigationterminated."）

可以看到，代码中定义了接收到各种命令时的控制策略。

2、仿真测试

和前面一样，我们在rviz中进行仿真测试。

首先是运行一个机器人模型：

$roslaunchrbx1_bringupfake_turtlebot.launch

然后打开rviz：

$rosrunrvizrviz-d`rospackfindrbx1_nav`/sim_fuerte.vcg

如果不喜欢在终端里看输出，可以打开gui界面：

$rxconsole

再打开语音识别的节点：

$roslaunchrbx1_speechvoice_nav_commands.launch

最后就是机器人的控制节点了：

$roslaunchrbx1_speechturtlebot_voice_nav.launch

OK，打开上面这一堆的节点之后，就可以开始了。

可以使用的命令如下：

下图是我的测试结果，不过感觉准确度还是欠佳：

四、播放语音

现在机器人已经可以按照我们说的话行动了，要是机器人可以和我们对话就更好了。

ROS中已经集成了这样的包，下面就来尝试一下。

运行下面的命令：

$rosrunsound_playsoundplay_node.py

$rosrunsound_playsay.py"GreetingsHumans.Takemetoyourleader."

有没有听见声音！

ROS通过识别我们输入的文本，让机器人读了出来。

发出这个声音的人叫做kal_diphone，如果不喜欢，我们也可以换一个人来读：

$sudoapt-getinstallfestvox-don

$rosrunsound_playsay.py"Welcometothefuture"voice_don_diphone

哈哈，这回换了一个人吧，好吧，这不也是我们的重点。

在rbx1_speech/nodes文件夹中有一个让机器人说话的节点talkback.py:

/usr/bin/envpython

importroslib;roslib.load_manifest（'rbx1_speech'）

importrospy

fromstd_msgs.msgimportString

fromsound_play.libsoundplayimportSoundClient

importsys

classTalkBack:

def__init__（self,script_path）:

rospy.init_node（'talkback'）

rospy.on_shutdown（self.cleanup）

#SetthedefaultTTSvoicetouse

self.voice=rospy.get_param（"~voice","voice_don_diphone"）

#Setthewavefilepathifused

self.wavepath=rospy.get_param（"~wavepath",script_path+"/../sounds"）

#Createthesoundclientobject

self.soundhandle=SoundClient（）

#Waitamomenttolettheclientconnecttothe

#sound_playserver

rospy.sleep

（1）

#Makesureanylingeringsound_playprocessesarestopped.

self.soundhandle.stopAll（）

#Announcethatwearereadyforinput

self.soundhandle.playWave（self.wavepath+"/R2D2a.wav"）

rospy.sleep

（1）

self.soundhandle.say（"Ready",self.voice）

rospy.loginfo（"Sayoneofthenavigationcommands..."）

#Subscribetotherecognizeroutputandsetthecallbackfunction

rospy.Subscriber（'/recognizer/output',String,self.talkback）

deftalkback（self,msg）:

#Printtherecognizedwordsonthescreen

rospy.loginfo（msg.data）

#Speaktherecognizedwordsintheselectedvoice

self.soundhandle.say（msg.data,self.voice）

#Uncommenttoplayoneofthebuilt-insounds

#rospy.sleep

（2）

#self.soundhandle.play（5）

#Uncommenttoplaya

展开阅读全文