blk_update_request I/O error, dev sda, WRITE SAME failed. Manually zeroing.内核错误提示解决方案
在服务器上安装完CentOS 系统,启动过程非常慢,一直刷新下面的错误提示
根分区是ext4格式,底层是使用MegaRAID构建的Raid1系统盘。出现上面上面的问题,初步怀疑是raid驱动问题或者是磁盘硬件问题,更新不同版本的megaraid_sas的驱动,以及更换硬盘都不能解决该问题。
CentOS 7.5使用的3.16版本的内核,上述错误内核源码部分如下:
/**
* blkdev_issue_zeroout - zero-fill a block range
* @bdev: blockdev to write
* @sector: start sector
* @nr_sects: number of sectors to write
* @gfp_mask: memory allocation flags (for bio_alloc)
*
* Description:
* Generate and issue number of bios with zerofiled pages.
*/
int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask)
{
if (bdev_write_same(bdev)) {
unsigned char bdn[BDEVNAME_SIZE];
if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
ZERO_PAGE(0)))
return 0;
bdevname(bdev, bdn);
pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
}
return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
}
EXPORT_SYMBOL(blkdev_issue_zeroout);
继续跟踪下blkdev_issue_write_same函数的实现情况
/**
* blkdev_issue_write_same - queue a write same operation
* @bdev: target blockdev
* @sector: start sector
* @nr_sects: number of sectors to write
* @gfp_mask: memory allocation flags (for bio_alloc)
* @page: page containing data to write
*
* Description:
* Issue a write same request for the sectors in question.
*/
int blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
sector_t nr_sects, gfp_t gfp_mask,
struct page *page)
{
DECLARE_COMPLETION_ONSTACK(wait);
struct request_queue *q = bdev_get_queue(bdev);
unsigned int max_write_same_sectors;
struct bio_batch bb;
struct bio *bio;
int ret = 0;
if (!q)
return -ENXIO;
max_write_same_sectors = q->limits.max_write_same_sectors;
if (max_write_same_sectors == 0)
return -EOPNOTSUPP;
atomic_set(&bb.done, 1);
bb.flags = 1 << BIO_UPTODATE;
bb.wait = &wait;
while (nr_sects) {
bio = bio_alloc(gfp_mask, 1);
if (!bio) {
ret = -ENOMEM;
break;
}
bio->bi_iter.bi_sector = sector;
bio->bi_end_io = bio_batch_end_io;
bio->bi_bdev = bdev;
bio->bi_private = &bb;
bio->bi_vcnt = 1;
bio->bi_io_vec->bv_page = page;
bio->bi_io_vec->bv_offset = 0;
bio->bi_io_vec->bv_len = bdev_logical_block_size(bdev);
if (nr_sects > max_write_same_sectors) {
bio->bi_iter.bi_size = max_write_same_sectors << 9;
nr_sects -= max_write_same_sectors;
sector += max_write_same_sectors;
} else {
bio->bi_iter.bi_size = nr_sects << 9;
nr_sects = 0;
}
atomic_inc(&bb.done);
submit_bio(REQ_WRITE | REQ_WRITE_SAME, bio);
}
/* Wait for bios in-flight */
if (!atomic_dec_and_test(&bb.done))
wait_for_completion_io(&wait);
if (!test_bit(BIO_UPTODATE, &bb.flags))
ret = -ENOTSUPP;
return ret;
}
EXPORT_SYMBOL(blkdev_issue_write_same);
逻辑这就清晰了,触发WRITE SAME操作的是这个变量 max_write_same_sectors, max_write_same_blocks非零时才会通过调用WRITE SAME命令实现清零操作。
查看了我们自己的物理机/sys/devices/pci0000:00/0000:00:02.2/0000:02:00.0/host0/target0:2:0/0:2:0:0/scsi_disk/0:2:0:0/max_write_same_blocks为非0值。
和REDHAT上的bug反馈信息看,是驱动支持该特性,底层硬件设备不支持WRITE SAME操作。
那解决方案就简单了,在系统启动过程中,将该值切换为0值。
不过底层硬件不支持该特性, 驱动为什么没有正确发现该设置呢?初始化/sys下配置的时候为什么就没有初始化为0值?这个问题待解。