===No checksumming in journal===
ext3 does not [[checksum]] the data it writes to the journal. On a storage device with its own write cache, if ''barrier=1'' is not enabled as a mount option (in [[/etc/fstab]]) and the hardware performs out-of-order write caching, a crash can cause severe filesystem corruption.<ref name="archives.free">[http://archives.free.net.ph/message/20070518.134838.52e26369.en.html Re: Frequent metadata corruption with ext3 + hard power-off] {{Webarchive|url=https://web.archive.org/web/20070928031902/http://archives.free.net.ph/message/20070518.134838.52e26369.en.html |date=2007-09-28 }}. Archives.free.net.ph. Retrieved on 2013-06-22.</ref><ref>[http://archives.free.net.ph/message/20070519.014256.ac3a2e07.en.html Re: Frequent metadata corruption with ext3 + hard power-off] {{Webarchive|url=https://web.archive.org/web/20070928031908/http://archives.free.net.ph/message/20070519.014256.ac3a2e07.en.html |date=2007-09-28 }}. Archives.free.net.ph. Retrieved on 2013-06-22.</ref><ref>Red Hat Enterprise Linux, [https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Storage_Administration_Guide/writebarr.html ''Chapter 20. Write Barriers'']</ref> This is because storage devices with write caches report to the system that the data has been completely written, even if it has only reached the (volatile) cache. If hard disk writes are done out of order (modern hard disks cache writes in order to [[amortized analysis|amortize]] write speeds), it is likely that the commit block of a transaction will be written before the other blocks that belong to it. If a power failure or unrecoverable crash occurs before those other blocks are written, the system has to be rebooted. Upon reboot, the file system replays the log as normal, including every "winner" (a transaction with a commit block), and therefore also the incomplete transaction above, which happens to carry a valid commit block. 
The unfinished disk write above will thus proceed, but using corrupt journal data. While replaying the journal, the file system will mistakenly overwrite normal data with corrupt data. If checksums had been used, with the blocks of the "fake winner" transaction tagged with a mutual checksum, the file system could have detected the mismatch and declined to replay the corrupt data onto the disk. Journal checksumming has been added to ext4.<ref>[http://article.gmane.org/gmane.linux.file-systems/21373 ext4: Add the journal checksum feature]. Article.gmane.org (2008-02-26). Retrieved on 2013-06-22.</ref>

Filesystems going through the device mapper interface (including software [[RAID]] and LVM implementations) may not support barriers, and will issue a warning if that mount option is used.<ref>[http://oss.sgi.com/archives/xfs/2007-12/msg00080.html Re: write barrier over device mapper supported or not?] {{Webarchive|url=https://web.archive.org/web/20090504120507/http://oss.sgi.com/archives/xfs/2007-12/msg00080.html |date=2009-05-04 }}. Oss.sgi.com. Retrieved on 2013-06-22.</ref><ref>[http://madduck.net/blog/2006.08.11:xfs-zeroes/ XFS and zeroed files] {{Webarchive|url=https://web.archive.org/web/20080430221349/http://madduck.net/blog/2006.08.11:xfs-zeroes/ |date=2008-04-30 }}. Madduck.net (2008-07-11). Retrieved on 2013-06-22.</ref> Some disks also do not properly implement the write cache flushing extension necessary for barriers to work, which causes a similar warning.<ref>[https://web.archive.org/web/20110727154012/http://forums.opensuse.org/archives/sls-archives/suse-linux/desktop-environments/379681-barrier-sync.html Barrier Sync]. forums.opensuse.org (March 2007)</ref> In these situations, where barriers are not supported or practical, reliable write ordering is possible by turning off the disk's write cache and using the {{code|1=data=journal}} mount option.<ref name="archives.free" />

Turning off the disk's write cache may be required even when barriers are available. Applications like databases expect a call to [[sync (Unix)|fsync()]] to flush pending writes to disk, and the barrier implementation does not always clear the drive's write cache in response to that call.<ref>[http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg272253.html Re: Proposal for "proper" durable fsync() and fdatasync()]. Mail-archive.com (2008-02-26). Retrieved on 2013-06-22.</ref> There is also a potential issue with the barrier implementation related to error handling during events such as a drive failure.<ref>[http://www.mjmwired.net/kernel/Documentation/block/barrier.txt I/O Barriers, as of kernel version 2.6.31]. Mjmwired.net. Retrieved on 2013-06-22.</ref> Some [[virtualization]] technologies are also known not to forward fsync or flush commands properly from a guest operating system to the underlying devices (files, volumes, disks).<ref>[http://www.mysqlperformanceblog.com/2011/03/21/virtualization-and-io-modes-extra-complexity/ Virtualization and IO Modes = Extra Complexity]. Mysqlperformanceblog.com (2011-03-21). Retrieved on 2013-06-22.</ref> Similarly, some hard disks or controllers implement cache flushing incorrectly or not at all, but still advertise that it is supported, and do not return any error when it is used.<ref>[http://www.mysqlperformanceblog.com/2009/03/02/ssd-xfs-lvm-fsync-write-cache-barrier-and-lost-transactions/ SSD, XFS, LVM, fsync, write cache, barrier and lost transactions]. Mysqlperformanceblog.com (2009-03-02). Retrieved on 2013-06-22.</ref> Because there are so many ways for fsync and write-cache handling to go wrong, it is safer to assume that cache flushing does not work unless it has been explicitly tested, regardless of how reliable individual components are believed to be.
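
The replay failure described at the start of this section can be illustrated with a toy model. The following is a minimal Python sketch under stated assumptions, not ext3 or ext4 code: {{code|Transaction}}, {{code|crc_of}}, {{code|replay}} and {{code|demo}} are hypothetical names, the "disk" is a plain dictionary, and CRC32 from {{code|zlib}} merely stands in for whatever checksum a real journal might use. It shows a transaction whose commit block reached the disk before its data block did, replayed once without and once with checksum verification.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Transaction:
    """Toy journal transaction (hypothetical structure, for illustration only)."""
    blocks: Dict[int, bytes]        # target block number -> contents found in the journal
    committed: bool = False         # True if a commit block is present on disk
    checksum: Optional[int] = None  # checksum recorded in the commit block, if any


def crc_of(blocks: Dict[int, bytes]) -> int:
    """CRC32 over all journaled blocks, in block-number order."""
    crc = 0
    for number in sorted(blocks):
        crc = zlib.crc32(number.to_bytes(8, "little") + blocks[number], crc)
    return crc


def replay(journal: List[Transaction], disk: Dict[int, bytes], verify: bool) -> int:
    """Replay committed transactions onto `disk`; return how many were applied."""
    applied = 0
    for txn in journal:
        if not txn.committed:
            continue  # no commit block: not a "winner", never replayed
        if verify and txn.checksum != crc_of(txn.blocks):
            break     # journaled blocks do not match the commit record: stop replay
        disk.update(txn.blocks)
        applied += 1
    return applied


def demo():
    """Return (disk replayed without checksums, disk replayed with checksums)."""
    intended = {7: b"new inode table"}
    # The commit block, carrying the checksum of the intended contents, reached
    # the disk, but the journal copy of block 7 was still in the drive's cache
    # when power was lost, so the journal holds stale bytes for it.
    torn = Transaction(blocks={7: b"stale garbage"}, committed=True,
                       checksum=crc_of(intended))
    unchecked = {7: b"old inode table"}
    checked = dict(unchecked)
    replay([torn], unchecked, verify=False)
    replay([torn], checked, verify=True)
    return unchecked, checked


if __name__ == "__main__":
    unchecked, checked = demo()
    print("without checksum:", unchecked[7])
    print("with checksum:   ", checked[7])
```

Without verification the stale journal contents overwrite the old, still consistent block; with verification the mismatch is detected and the block is left alone. Stopping at the first bad transaction is a design choice of this sketch, made because later transactions may depend on the one that was lost.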
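
One rough way to look for a cache that ignores flushes is to time repeated fsync() calls. The Python sketch below is an assumption-laden heuristic, not a definitive test: {{code|fsync_rate}} and {{code|looks_suspicious}} are hypothetical helper names, the 7200 rpm default is an assumed drive speed, and the reasoning (a rotating disk that really flushes cannot rewrite the same block much more than once per revolution) applies only to rotating disks without a battery-backed cache. A plausible rate proves nothing on its own, so an actual power-loss test is still needed for confidence.

```python
import os
import tempfile
import time


def fsync_rate(directory: str, iterations: int = 50) -> float:
    """Return observed write+fsync cycles per second for a file in `directory`."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for i in range(iterations):
            os.pwrite(fd, i.to_bytes(8, "little"), 0)  # rewrite the same block
            os.fsync(fd)                               # ask for it to be made durable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return iterations / elapsed if elapsed > 0 else float("inf")


def looks_suspicious(rate: float, rpm: int = 7200) -> bool:
    """True if `rate` exceeds one durable rewrite per platter revolution."""
    return rate > rpm / 60.0


if __name__ == "__main__":
    rate = fsync_rate(tempfile.gettempdir())
    print(f"{rate:.0f} fsync/s, suspicious for a 7200 rpm disk: {looks_suspicious(rate)}")
```

The directory passed to {{code|fsync_rate}} should live on the filesystem and device under test; the temporary directory used in the example is only a placeholder.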