VFS Notes

VFS initialization

→start_kernel()

→vfs_caches_init()

fs/dcache.c

→→mnt_init()

fs/namespace.c

→→→init_rootfs()

# Registers the rootfs filesystem type.

→→→init_mount_tree()

→→→→do_kern_mount("rootfs", 0, "rootfs", NULL)

fs/super.c

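→→→→→vfs_kern_mount()

→→→→→→alloc_vfsmnt()

# Allocates and initializes a vfsmount object.

→→→→→→→kmem_cache_zalloc()

# Allocates zeroed memory for the vfsmount.

→→→→→→→mnt_alloc_id()

# Assigns an ID to the vfsmount.

# A vfsmount describes one mount instance. It records the device node of the mounted device, the mount point, and other required bookkeeping data; all vfsmount objects are linked together through the kernel's generic lists.

Each vfsmount has its own unique mnt_id.

The vfsmount is used whenever the mounted filesystem is accessed.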

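→→→→→→rootfs->get_sb / rootfs_get_sb()

# Calls the get_sb method of rootfs to obtain the superblock object.

→→→→→→→get_sb_nodev()

fs/super.c

# At first I did not plan to trace this function any further: what it does is allocate the sb and initialize it.

# Because the rootfs mounted here has no real block device behind it, initialization uses two callbacks shared with ramfs, set_anon_super and ramfs_fill_super.

# Comparing the code of rootfs_get_sb() and ramfs_get_sb(), rootfs is a special kind of ramfs. The only difference is that it is mounted with the MS_NOUSER flag, so it cannot be mounted from user space.

## On a second look, this function does have to be traced, because a major purpose of vfs_caches_init() is to set up the root directory "/", and get_sb_nodev() is exactly where that work is done.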

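→→→→→→→→s = sget()

# Creates a superblock object of type rootfs and initializes it through the set_anon_super callback.

Because rootfs has no block device behind it, set_anon_super creates a virtual device number for it using ida_get_new() and MKDEV(). The allocation policy involves a fair amount of code and does not affect the overall picture, so it is not traced here.

→→→→→→→→ramfs_fill_super()

→→→→→→→→→ramfs_get_inode()

# ramfs_fill_super() first calls ramfs_get_inode() to create a new inode object on sb; this inode is the inode of the root.

struct inode *inode = new_inode(sb);

...

inode->i_op = &ramfs_dir_inode_operations;

inode->i_fop = &simple_dir_operations;

→→→→→→→→→d_alloc_root()

# It then calls d_alloc_root() to create the dentry for the root. The code is simple enough to quote directly.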

if (root_inode) {
    static const struct qstr name = { .name = "/", .len = 1 };

    res = d_alloc(NULL, &name);
    if (res) {
        res->d_sb = root_inode->i_sb;
        res->d_parent = res;
        d_instantiate(res, root_inode);
    }
}

# At this point the root "/" has been set up.

Operations on "/" therefore go through dentry → inode → i_op and end up in ramfs_dir_inode_operations.
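The one unusual line in d_alloc_root() is res->d_parent = res: the root dentry is its own parent. A minimal userspace sketch of that idea follows. The toy_* types and functions are hypothetical stand-ins, not kernel code, and carry only the fields discussed above.

```c
#include <stddef.h>

/* Toy stand-ins for the kernel structures (hypothetical names). */
struct toy_inode { int ino; };

struct toy_dentry {
    const char        *name;
    struct toy_dentry *d_parent;
    struct toy_inode  *d_inode;
};

/* Mirrors the shape of d_alloc_root(): the root dentry is its own parent. */
static int toy_alloc_root(struct toy_dentry *res, struct toy_inode *root_inode)
{
    if (!root_inode)
        return -1;
    res->name = "/";
    res->d_parent = res;          /* same idea as res->d_parent = res */
    res->d_inode = root_inode;    /* same idea as d_instantiate() */
    return 0;
}

/* A dentry is a root exactly when it is its own parent. */
static int toy_is_root(const struct toy_dentry *d)
{
    return d->d_parent == d;
}
```

The self-parent convention gives the rest of the code a cheap test for "is this a root?", and it means that walking upward from the root can never run off the top of the tree.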

→→→init_mount_tree() continue

ns->root = mnt;
mnt->mnt_ns = ns;

init_task.nsproxy->mnt_ns = ns;
get_mnt_ns(ns);

root.mnt = ns->root;
root.dentry = ns->root->mnt_root;

set_fs_pwd(current->fs, &root);
set_fs_root(current->fs, &root);

# Finally, init_mount_tree() uses the code above to attach the newly created vfsmount object to init_task, so every process created afterwards knows about this root.

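VFS: from a path to a concrete file

We take the open system call issued from user space as the example. When a process tries to open a file, the kernel enters the following function.

→do_sys_open()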

long do_sys_open(int dfd, const char __user *filename, int flags, int mode)
{
    char *tmp = getname(filename);
    int fd = PTR_ERR(tmp);

    if (!IS_ERR(tmp)) {
        fd = get_unused_fd_flags(flags);
        if (fd >= 0) {
            struct file *f = do_filp_open(dfd, tmp, flags, mode);
            if (IS_ERR(f)) {
                put_unused_fd(fd);
                fd = PTR_ERR(f);
            } else {
                fsnotify_open(f->f_path.dentry);
                fd_install(fd, f);
            }
        }
        putname(tmp);
    }
    return fd;
}

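→→getname()

do_sys_open() first calls getname(), which allocates kernel-space memory for the name string and calls strncpy_from_user() to copy the name from user space into that buffer.

→→get_unused_fd_flags()

Next it calls get_unused_fd_flags() to allocate an unused file descriptor fd for the file.

The allocation is done mainly by alloc_fd(), through bit operations on files_struct->fdt.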
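To make "bit operations on the fd table" concrete, here is a rough sketch of a bitmap allocator that hands out the lowest free descriptor. It is not the kernel's alloc_fd(): the toy_fdtable layout, the fixed 64-descriptor limit, and the function names are assumptions made for illustration, and the real code also grows the table and takes locks.

```c
#include <limits.h>

#define TOY_MAX_FDS 64
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS ((TOY_MAX_FDS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Hypothetical, simplified fd table: one bit per descriptor, 1 = in use. */
struct toy_fdtable {
    unsigned long open_fds[TOY_WORDS];
};

/* Return the lowest free descriptor >= start and mark it used, or -1 if full. */
static int toy_alloc_fd(struct toy_fdtable *fdt, unsigned int start)
{
    for (unsigned int fd = start; fd < TOY_MAX_FDS; fd++) {
        unsigned long mask = 1UL << (fd % TOY_BITS_PER_LONG);
        unsigned long *word = &fdt->open_fds[fd / TOY_BITS_PER_LONG];

        if (!(*word & mask)) {
            *word |= mask;
            return (int)fd;
        }
    }
    return -1;
}

/* Clear the bit again, which is what the error path needs after a failed open. */
static void toy_put_fd(struct toy_fdtable *fdt, unsigned int fd)
{
    if (fd < TOY_MAX_FDS)
        fdt->open_fds[fd / TOY_BITS_PER_LONG] &= ~(1UL << (fd % TOY_BITS_PER_LONG));
}
```

This matches the flow in do_sys_open(): a descriptor is reserved first, and if do_filp_open() fails the reservation is released again with put_unused_fd().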

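→→do_filp_open()

If a valid file descriptor was allocated, do_filp_open() is called to find the file and build the corresponding file object.

do_filp_open() is the core of the open operation, and its implementation is fairly complex.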

/*
 * The simplest case - just a plain lookup.
 */
if (!(flag & O_CREAT)) {
    error = path_lookup_open(dfd, pathname, lookup_flags(flag),
                             &nd, flag);
    if (error)
        return ERR_PTR(error);
    goto ok;
}

The function first examines the flags passed to open. If O_CREAT is not set, it goes straight to path_lookup_open().

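→→→path_lookup_open()

→→→→get_empty_filp()

It first calls get_empty_filp() to allocate a new file object.

Before allocating, get_empty_filp() checks whether the total number of open files in the system has reached the permitted maximum, files_stat.max_files:

NR_FILE, 8192

/*
 * Privileged users can go above max_files
 */
if (get_nr_files() >= files_stat.max_files && !capable(CAP_SYS_ADMIN)) {
    /*
     * percpu_counters are inaccurate.  Do an expensive check before
     * we go and fail.
     */
}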

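→→→→do_path_lookup()

struct fs_struct *fs = current->fs;

The function first obtains the fs_struct of the current process.

if (*name == '/') {
    read_lock(&fs->lock);
    nd->path = fs->root;    // take the root of the current fs
    path_get(&fs->root);    // add a reference to the root entry
    read_unlock(&fs->lock);
} else if (dfd == AT_FDCWD) {
    read_lock(&fs->lock);
    nd->path = fs->pwd;
    path_get(&fs->pwd);
    read_unlock(&fs->lock);
} else {

It checks whether the user-supplied path starts with "/". If so, it takes the root path of the current fs and increments its reference count.

If not, and dfd is AT_FDCWD, it takes the pwd path of the current fs.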

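→→→→→path_walk()

→→→→→→link_path_walk()

→→→→→→→__link_path_walk()

This function contains the actual details of the walk, so it is relatively long.

while (*name == '/')
    name++;
if (!*name)
    goto return_reval;

It first strips the leading '/' characters from the path. Whether the walk starts from root or from pwd has already been decided, and the corresponding struct path has been stored in the nameidata that is passed into this function.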

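for (;;) {
    unsigned long hash;
    struct qstr this;
    unsigned int c;

Next comes the main body of the function, one large for loop. Each iteration does the following:

1. Checks permissions on the inode of the current dentry (dentry->d_inode).

2. Extracts the next file or directory name from the pathname.

For example, given "/dev/sda1", the leading "/" has already been stripped, so this iteration extracts dev and the next one extracts sda1.

3. Looks up the extracted name in the parent dentry: dev is looked up in the dentry for "/", and sda1 in the dentry for dev.

The lookup first tries the dentry hash and goes to the actual filesystem only if nothing is found there.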
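The three steps above can be sketched in userspace as a loop over a small in-memory tree. This is only an illustration under simplifying assumptions: struct toy_node and the toy_* functions are made up for this sketch, the permission check of step 1 is left out, and there is no separate cache in front of the "filesystem".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory directory node standing in for a dentry. */
struct toy_node {
    const char      *name;
    struct toy_node *children;   /* first child */
    struct toy_node *sibling;    /* next entry in the same directory */
};

/* Step 3: find a child of dir whose name matches name[0..len). */
static struct toy_node *toy_lookup(struct toy_node *dir, const char *name, size_t len)
{
    for (struct toy_node *c = dir->children; c; c = c->sibling)
        if (strlen(c->name) == len && memcmp(c->name, name, len) == 0)
            return c;
    return NULL;
}

/* Walk an absolute path from root, one component per loop iteration. */
static struct toy_node *toy_walk(struct toy_node *root, const char *path)
{
    struct toy_node *cur = root;

    while (*path == '/')            /* strip leading slashes */
        path++;
    while (*path && cur) {
        const char *start = path;   /* step 2: cut out the next component */

        while (*path && *path != '/')
            path++;
        cur = toy_lookup(cur, start, (size_t)(path - start));
        while (*path == '/')        /* skip separators before the next name */
            path++;
    }
    return cur;                     /* NULL if any component was missing */
}
```

With a tree containing /dev/sda1, toy_walk(&root, "/dev/sda1") visits dev and then sda1, exactly the two iterations described in the example above.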

Now for the actual code.

nd->flags |= LOOKUP_CONTINUE;
err = exec_permission_lite(inode);
if (err == -EAGAIN)
    err = inode_permission(nd->path.dentry->d_inode,
                           MAY_EXEC);
if (err)
    break;

This part performs the permission check; the details are not explored here.

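this.name = name;
c = *(const unsigned char *)name;

hash = init_name_hash();
do {
    name++;
    hash = partial_name_hash(c, hash);
    c = *(const unsigned char *)name;
} while (c && (c != '/'));
this.len = name - (const char *)this.name;
this.hash = end_name_hash(hash);

this is of type qstr, the so-called "quick string", which holds the original string, its length, and its hash. The path component is hashed here mainly to avoid string comparisons wherever possible.

The code above places the name to be looked up in this iteration into this and computes its hash; the resulting qstr is used again further down.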
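The do/while loop can be tried out in userspace with a small copy of the qstr idea. In this sketch, toy_qstr and the toy_* functions are hypothetical names. The arithmetic in toy_partial_name_hash() is written from memory of the 2.6-era include/linux/dcache.h and should be treated as an assumption, not as the authoritative definition.

```c
#include <stddef.h>

/* Userspace copy of the qstr idea: name pointer, length, and hash. */
struct toy_qstr {
    const char   *name;
    unsigned int  len;
    unsigned int  hash;
};

/* Assumed to match the 2.6-era partial_name_hash(); verify before relying on it. */
static unsigned long toy_partial_name_hash(unsigned long c, unsigned long prev)
{
    return (prev + (c << 4) + (c >> 4)) * 11;
}

/*
 * Scan one component starting at *pname, fill in q, and leave *pname on
 * the terminating '/' or '\0', like the do/while loop quoted above.
 * The caller must ensure that **pname is neither '/' nor '\0'.
 */
static void toy_next_component(const char **pname, struct toy_qstr *q)
{
    const char *name = *pname;
    unsigned long hash = 0;                 /* plays the role of init_name_hash() */
    unsigned int c = *(const unsigned char *)name;

    q->name = name;
    do {
        name++;
        hash = toy_partial_name_hash(c, hash);
        c = *(const unsigned char *)name;
    } while (c && c != '/');
    q->len = (unsigned int)(name - q->name);
    q->hash = (unsigned int)hash;           /* plays the role of end_name_hash() */
    *pname = name;
}
```

Note that the qstr does not copy the string: name points into the original path, and len marks where the component ends. Two equal components always produce the same hash, which is what allows the later lookup to reject most candidates by comparing integers.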

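/* remove trailing slashes? */
if (!c)
    goto last_component;
while (*++name == '/');
if (!*name)
    goto last_with_slashes;

This checks whether this is the final component of the path, that is, the target file or directory name. If it is, control jumps to last_component or last_with_slashes; otherwise execution continues below.

We look at the latter case first.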

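/*
 * "." and ".." are special - ".." especially so because it has
 * to be able to know about the current root directory and
 * parent relationships.
 */
if (this.name[0] == '.') switch (this.len) {
    default:
        break;
    case 2:
        if (this.name[1] != '.')
            break;
        follow_dotdot(nd);
        inode = nd->path.dentry->d_inode;
        /* fallthrough */
    case 1:
        continue;
}

This code handles two special cases, "." and "..". The former means the current directory, so the loop simply continues. The latter means going up to the parent directory: follow_dotdot() is called, which sets the dentry in nd to the dentry of the parent directory.

The details are not traced here.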
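As a rough illustration of the two special cases, the sketch below classifies a component and applies it to a toy directory tree. struct toy_dir and the toy_* functions are hypothetical, and the real follow_dotdot() does more than this: it also has to respect the process root and cross mount boundaries.

```c
#include <stddef.h>

/* Hypothetical directory node; the root points to itself as its parent. */
struct toy_dir {
    struct toy_dir *parent;
};

enum toy_kind { TOY_NORMAL, TOY_DOT, TOY_DOTDOT };

/* Same test as the switch on this.len quoted above. */
static enum toy_kind toy_classify(const char *name, unsigned int len)
{
    if (len == 1 && name[0] == '.')
        return TOY_DOT;
    if (len == 2 && name[0] == '.' && name[1] == '.')
        return TOY_DOTDOT;
    return TOY_NORMAL;
}

/*
 * Apply "." or ".." to cur. Returns the new current directory, or NULL
 * for a normal name, which the caller has to look up instead.
 */
static struct toy_dir *toy_step(struct toy_dir *cur, const char *name, unsigned int len)
{
    switch (toy_classify(name, len)) {
    case TOY_DOT:
        return cur;
    case TOY_DOTDOT:
        return cur->parent;     /* root->parent == root, so ".." at root stays put */
    default:
        return NULL;
    }
}
```

Names such as ".x" or "..." start with a dot but fall through to the normal lookup, which is why the kernel code switches on the length before it looks at the second character.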

/* This does the actual lookups.. */
err = do_lookup(nd, &this, &next);
if (err)
    break;

→→→→→→→→do_lookup()

This is the core of the lookup.

The function is short, so its code is quoted here:

struct vfsmount *mnt = nd->path.mnt;
struct dentry *dentry = __d_lookup(nd->path.dentry, name);

if (!dentry)
    goto need_lookup;
if (dentry->d_op && dentry->d_op->d_revalidate)
    goto need_revalidate;

__d_lookup() searches the dentry hash. This hash is kernel-wide and covers all cached dentries, not only those opened by the current process.

→→→→→→→→→__d_lookup()

struct hlist_head *head = d_hash(parent, hash);

rcu_read_lock();

hlist_for_each_entry_rcu(dentry, node, head, d_hash) {
    struct qstr *qstr;

    if (dentry->d_name.hash != hash)
        continue;
    if (dentry->d_parent != parent)
        continue;

    spin_lock(&dentry->d_lock);

    /*
     * Recheck the dentry after taking the lock - d_move may have
     * changed things.  Don't bother checking the hash because we're
     * about to compare the whole name anyway.
     */
    if (dentry->d_parent != parent)
        goto next;

    /* non-existing due to RCU? */
    if (d_unhashed(dentry))
        goto next;

    /*
     * It is safe to compare names since d_move() cannot
     * change the qstr (protected by d_lock).
     */
    qstr = &dentry->d_name;
    if (parent->d_op && parent->d_op->d_compare) {
        if (parent->d_op->d_compare(parent, qstr, name))
            goto next;
    } else {
        if (qstr->len != len)
            goto next;
        if (memcmp(qstr->name, str, len))
            goto next;
    }

    atomic_inc(&dentry->d_count);
    found = dentry;
    spin_unlock(&dentry->d_lock);
    break;
next:
    spin_unlock(&dentry->d_lock);
}
rcu_read_unlock();

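The function first uses d_hash() to find, in the system-wide dentry_hashtable, the list that corresponds to (parent, hash); entries whose hashes collide end up on the same list. It then walks that list, comparing the hash value, the parent, and the name string in turn. On a match it stops, increments dentry->d_count to record one more reference, and returns the dentry.

Note that two kinds of synchronization protect the shared data here. Before the traversal, rcu_read_lock() protects the dentry list; RCU is a synchronization mechanism whose read side has very low overhead.

When an individual dentry is examined, spin_lock is used to protect that dentry's data.

Notice that d_parent is compared once before and once after spin_lock.

The first comparison rejects non-matching dentries early. The second is a double check, in case d_move() moved the dentry within the hash while the spinlock was being acquired.

Very finely crafted.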
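Leaving the locking aside, the lookup itself is a chained hash table search with the cheap comparisons first. The sketch below shows that ordering in userspace. The toy_* names, the 16-bucket table, and the bucket function are assumptions for illustration; the RCU read section, the per-dentry spinlock, and the recheck after taking the lock are deliberately omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BUCKETS 16

struct toy_dentry {
    struct toy_dentry *d_parent;
    const char        *name;
    unsigned int       len;
    unsigned int       hash;
    int                d_count;   /* reference count */
    struct toy_dentry *next;      /* next entry in the same bucket */
};

static struct toy_dentry *toy_table[TOY_BUCKETS];

/* Hypothetical bucket function: mixes the parent pointer into the name hash. */
static unsigned int toy_bucket(const struct toy_dentry *parent, unsigned int hash)
{
    return (unsigned int)((((uintptr_t)parent >> 4) + hash) % TOY_BUCKETS);
}

static void toy_insert(struct toy_dentry *d)
{
    unsigned int b = toy_bucket(d->d_parent, d->hash);

    d->next = toy_table[b];
    toy_table[b] = d;
}

/* Compare hash and parent first; only then pay for the full name comparison. */
static struct toy_dentry *toy_d_lookup(struct toy_dentry *parent, const char *name,
                                       unsigned int len, unsigned int hash)
{
    struct toy_dentry *d;

    for (d = toy_table[toy_bucket(parent, hash)]; d; d = d->next) {
        if (d->hash != hash)
            continue;
        if (d->d_parent != parent)
            continue;
        if (d->len != len || memcmp(d->name, name, len) != 0)
            continue;
        d->d_count++;             /* the caller now holds a reference */
        return d;
    }
    return NULL;
}
```

Because the parent is part of the key, two directories can each contain an entry called dev without the two entries being confused, even when they land in the same bucket.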

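→→→→→→→→do_lookup() continue

After __d_lookup() returns, if no matching dentry was found, control jumps to need_lookup and the lookup goes into the specific filesystem. Otherwise the function checks whether the dentry that was found needs revalidation; if so, it goes to need_revalidate to verify that the target file still really exists.

For example, when the file being accessed lives on a network filesystem, need_revalidate is used to check whether it still exists there.

In all other cases control reaches done:

done:
    path->mnt = mnt;
    path->dentry = dentry;
    __follow_mount(path);
    return 0;

This code handles the case where the directory is used as the mount point of another filesystem.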

→→→→→→→→→__follow_mount()

int res = 0;
while (d_mountpoint(path->dentry)) {
    struct vfsmount *mounted = lookup_mnt(path->mnt, path->dentry);
    if (!mounted)
        break;
    dput(path->dentry);
    if (res)
        mntput(path->mnt);
    path->mnt = mounted;
    path->dentry = dget(mounted->mnt_root);
    res = 1;
}
return res;

If a dentry is used as the mount point of another filesystem, its dentry->d_mounted is set.

Here d_mountpoint() tests d_mounted. If it is set, lookup_mnt() finds the filesystem mounted there, and the dentry is replaced by that mount's mnt_root.

Note that this is a while, not an if: the root of the mounted filesystem may itself be the mount point of yet another filesystem, so the loop has to continue until dentry->d_mounted is 0.
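The reason for the while loop is easiest to see with stacked mounts. Below is a minimal sketch, assuming a flat mount table that is keyed by the mount-point directory alone; the kernel keys its lookup on the pair (parent vfsmount, dentry) and also manages reference counts, both of which are left out here. All toy_* names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical directory: d_mounted is non-zero if something is mounted on it. */
struct toy_dir {
    const char *name;
    int         d_mounted;
};

/* Hypothetical mount record: the filesystem rooted at root sits on mountpoint. */
struct toy_mount {
    struct toy_dir *mountpoint;
    struct toy_dir *root;
};

/* Linear stand-in for lookup_mnt(): find the mount sitting on dir. */
static const struct toy_mount *toy_lookup_mnt(const struct toy_mount *tbl, size_t n,
                                              const struct toy_dir *dir)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].mountpoint == dir)
            return &tbl[i];
    return NULL;
}

/* Keep descending while the current directory is itself a mount point. */
static struct toy_dir *toy_follow_mount(const struct toy_mount *tbl, size_t n,
                                        struct toy_dir *dir, int *crossed)
{
    *crossed = 0;
    while (dir->d_mounted) {
        const struct toy_mount *m = toy_lookup_mnt(tbl, n, dir);

        if (!m)
            break;
        dir = m->root;
        *crossed = 1;           /* same role as res = 1 in __follow_mount() */
    }
    return dir;
}
```

If fs1 is mounted on /mnt and fs2 is then mounted on the root of fs1, a single if would stop at the root of fs1, which is hidden; the loop continues to the root of fs2, which is what a lookup of /mnt should see.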

→→→→→→→→→→lookup_mnt()

struct list_head *head = mount_hashtable + hash(mnt, dentry);
struct list_head *tmp = head;
struct vfsmount *p, *found = NULL;

for (;;) {
    tmp = dir ? tmp->next : tmp->prev;
    p = NULL;
    if (tmp == head)
        break;
    p = list_entry(tmp, struct vfsmount, mnt_hash);
    if (p->mnt_parent == mnt && p->mnt_mountpoint == dentry) {
        found = p;
        break;
    }
}
return found;

All mount information in the system is hashed inside the kernel, and mount_hashtable is the entry point.

lookup_mnt() finds in mount_hashtable the list that corresponds to (mnt, dentry) and walks it to find the target vfsmount.

→→→→→→→→do_lookup() continue

Back to the need_lookup label.

This handles the case where the target dentry was not found in dentry_hashtable.

need_lookup:
    dentry = real_lookup(nd->path.dentry, name, nd);
    if (IS_ERR(dentry))
        goto fail;
    goto done;

→→→→→→→→→real_lookup()

mutex_lock(&dir->i_mutex);
result = d_lookup(parent, name);

It first takes the lock, then calls d_lookup() to search dentry_hashtable once more, because the dentry may have been added to the cache while this task was waiting for the mutex.

if (!result) {
    struct dentry *dentry;

    /* Don't create child dentry for a dead directory. */
    result = ERR_PTR(-ENOENT);
    if (IS_DEADDIR(dir))
        goto out_unlock;

    dentry = d_alloc(parent, name);
    result = ERR_PTR(-ENOMEM);
