Commit 68f2736a authored by Matthew Wilcox (Oracle)

mm: Convert all PageMovable users to movable_operations



These drivers are rather uncomfortably hammered into the
address_space_operations hole.  They aren't filesystems and don't behave
like filesystems.  They just need their own movable_operations structure,
which we can point to directly from page->mapping.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
parent 81218f80
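
To illustrate the idea of pointing at a movable_operations structure directly from page->mapping, here is a minimal userspace sketch in C. It is not kernel code and is not taken from this commit: struct page, the simplified callback signatures, the demo_* driver and the helpers set_page_movable(), page_is_movable(), page_movable_ops() and move_page() are all assumptions made for the example. Only the PAGE_MAPPING_MOVABLE value (0x2) and the use of the low two bits of page->mapping as tags come from the documentation touched by the diff below.

```c
/* Userspace model only: these types mimic, but are not, the kernel's. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_MAPPING_MOVABLE 0x2UL /* tag value quoted in the removed docs */
#define PAGE_MAPPING_FLAGS   0x3UL /* the low two bits are reserved for tags */

struct page {
	void *mapping;  /* tagged pointer, modelled on page->mapping */
	int isolated;   /* stand-in for the PG_isolated flag */
	int payload;    /* hypothetical driver data */
};

/* Assumed shape of the ops table; names follow the old aops hooks. */
struct movable_operations {
	bool (*isolate_page)(struct page *page);
	int (*migrate_page)(struct page *dst, struct page *src);
	void (*putback_page)(struct page *page);
};

/* Store the ops pointer in page->mapping, tagged as movable. */
static void set_page_movable(struct page *page,
			     const struct movable_operations *mops)
{
	/* The tag only works if the pointer's low two bits are free. */
	assert(((uintptr_t)mops & PAGE_MAPPING_FLAGS) == 0);
	page->mapping = (void *)((uintptr_t)mops | PAGE_MAPPING_MOVABLE);
}

static bool page_is_movable(const struct page *page)
{
	return ((uintptr_t)page->mapping & PAGE_MAPPING_FLAGS) ==
	       PAGE_MAPPING_MOVABLE;
}

/* Strip the tag bits to recover the ops table, or NULL if not movable. */
static const struct movable_operations *
page_movable_ops(const struct page *page)
{
	if (!page_is_movable(page))
		return NULL;
	return (const struct movable_operations *)
		((uintptr_t)page->mapping & ~PAGE_MAPPING_FLAGS);
}

/* Hypothetical driver callbacks. */
static bool demo_isolate(struct page *page)
{
	if (page->isolated)
		return false;
	page->isolated = 1;
	return true;
}

static int demo_migrate(struct page *dst, struct page *src)
{
	dst->payload = src->payload;
	dst->mapping = src->mapping; /* the new page becomes movable */
	src->mapping = NULL;         /* the old page no longer is */
	return 0;
}

static void demo_putback(struct page *page)
{
	page->isolated = 0;
}

static const struct movable_operations demo_mops = {
	.isolate_page = demo_isolate,
	.migrate_page = demo_migrate,
	.putback_page = demo_putback,
};

/* Model of the caller: isolate, migrate, put back on failure. */
static int move_page(struct page *dst, struct page *src)
{
	const struct movable_operations *mops = page_movable_ops(src);
	int rc;

	if (!mops || !mops->isolate_page(src))
		return -1;
	rc = mops->migrate_page(dst, src);
	if (rc)
		mops->putback_page(src);
	else
		src->isolated = 0;
	return rc;
}
```

In this model a driver would tag each page once with set_page_movable(&page, &demo_mops) and the caller would later pass it to move_page(). Because the ops pointer lives in the page itself, no inode or address_space is needed to reach the callbacks, which is why the pseudo filesystem setup disappears from the balloon drivers in the hunks below.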
+0 −4
@@ -252,9 +252,7 @@ prototypes::
	bool (*release_folio)(struct folio *, gfp_t);
	void (*free_folio)(struct folio *);
	int (*direct_IO)(struct kiocb *, struct iov_iter *iter);
	bool (*isolate_page) (struct page *, isolate_mode_t);
	int (*migratepage)(struct address_space *, struct page *, struct page *);
	void (*putback_page) (struct page *);
	int (*launder_folio)(struct folio *);
	bool (*is_partially_uptodate)(struct folio *, size_t from, size_t count);
	int (*error_remove_page)(struct address_space *, struct page *);
@@ -280,9 +278,7 @@ invalidate_folio: yes exclusive
release_folio:		yes
free_folio:		yes
direct_IO:
isolate_page:		yes
migratepage:		yes (both)
putback_page:		yes
launder_folio:		yes
is_partially_uptodate:	yes
error_remove_page:	yes
+0 −12
@@ -737,12 +737,8 @@ cache in your filesystem. The following members are defined:
		bool (*release_folio)(struct folio *, gfp_t);
		void (*free_folio)(struct folio *);
		ssize_t (*direct_IO)(struct kiocb *, struct iov_iter *iter);
		/* isolate a page for migration */
		bool (*isolate_page) (struct page *, isolate_mode_t);
		/* migrate the contents of a page to the specified target */
		int (*migratepage) (struct page *, struct page *);
		/* put migration-failed page back to right list */
		void (*putback_page) (struct page *);
		int (*launder_folio) (struct folio *);

		bool (*is_partially_uptodate) (struct folio *, size_t from,
@@ -930,11 +926,6 @@ cache in your filesystem. The following members are defined:
	data directly between the storage and the application's address
	space.

``isolate_page``
	Called by the VM when isolating a movable non-lru page.  If page
	is successfully isolated, VM marks the page as PG_isolated via
	__SetPageIsolated.

``migrate_page``
	This is used to compact the physical memory usage.  If the VM
	wants to relocate a page (maybe off a memory card that is
@@ -942,9 +933,6 @@ cache in your filesystem. The following members are defined:
	page to this function.  migrate_page should transfer any private
	data across and update any references that it has to the page.

``putback_page``
	Called by the VM when isolated page's migration fails.

``launder_folio``
	Called before freeing a folio - it writes back the dirty folio.
	To prevent redirtying the folio, it is kept locked during the
+10 −103
@@ -152,110 +152,15 @@ Steps:
Non-LRU page migration
======================

Although migration originally aimed for reducing the latency of memory accesses
for NUMA, compaction also uses migration to create high-order pages.
Although migration originally aimed for reducing the latency of memory
accesses for NUMA, compaction also uses migration to create high-order
pages.  For compaction purposes, it is also useful to be able to move
non-LRU pages, such as zsmalloc and virtio-balloon pages.

Current problem of the implementation is that it is designed to migrate only
*LRU* pages. However, there are potential non-LRU pages which can be migrated
in drivers, for example, zsmalloc, virtio-balloon pages.

For virtio-balloon pages, some parts of migration code path have been hooked
up and added virtio-balloon specific functions to intercept migration logics.
It's too specific to a driver so other drivers who want to make their pages
movable would have to add their own specific hooks in the migration path.

To overcome the problem, VM supports non-LRU page migration which provides
generic functions for non-LRU movable pages without driver specific hooks
in the migration path.

If a driver wants to make its pages movable, it should define three functions
which are function pointers of struct address_space_operations.

1. ``bool (*isolate_page) (struct page *page, isolate_mode_t mode);``

   What VM expects from isolate_page() function of driver is to return *true*
   if driver isolates the page successfully. On returning true, VM marks the page
   as PG_isolated so concurrent isolation in several CPUs skip the page
   for isolation. If a driver cannot isolate the page, it should return *false*.

   Once page is successfully isolated, VM uses page.lru fields so driver
   shouldn't expect to preserve values in those fields.

2. ``int (*migratepage) (struct address_space *mapping,``
|	``struct page *newpage, struct page *oldpage, enum migrate_mode);``

   After isolation, VM calls migratepage() of driver with the isolated page.
   The function of migratepage() is to move the contents of the old page to the
   new page
   and set up fields of struct page newpage. Keep in mind that you should
   indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
   under page_lock if you migrated the oldpage successfully and returned
   MIGRATEPAGE_SUCCESS. If driver cannot migrate the page at the moment, driver
   can return -EAGAIN. On -EAGAIN, VM will retry page migration in a short time
   because VM interprets -EAGAIN as "temporary migration failure". On returning
   any error except -EAGAIN, VM will give up the page migration without
   retrying.

   Driver shouldn't touch the page.lru field while in the migratepage() function.

3. ``void (*putback_page)(struct page *);``

   If migration fails on the isolated page, VM should return the isolated page
   to the driver so VM calls the driver's putback_page() with the isolated page.
   In this function, the driver should put the isolated page back into its own data
   structure.

Non-LRU movable page flags

   There are two page flags for supporting non-LRU movable page.

   * PG_movable

     Driver should use the function below to make page movable under page_lock::

	void __SetPageMovable(struct page *page, struct address_space *mapping)

     It needs argument of address_space for registering migration
     family functions which will be called by VM. Exactly speaking,
     PG_movable is not a real flag of struct page. Rather, VM
     reuses the page->mapping's lower bits to represent it::

	#define PAGE_MAPPING_MOVABLE 0x2
	page->mapping = page->mapping | PAGE_MAPPING_MOVABLE;

     so driver shouldn't access page->mapping directly. Instead, driver should
     use page_mapping() which masks off the low two bits of page->mapping under
     page lock so it can get the right struct address_space.

     For testing of non-LRU movable pages, VM supports __PageMovable() function.
     However, it doesn't guarantee to identify non-LRU movable pages because
     the page->mapping field is unified with other variables in struct page.
     If the driver releases the page after isolation by VM, page->mapping
     doesn't have a stable value although it has PAGE_MAPPING_MOVABLE set
     (look at __ClearPageMovable). But __PageMovable() is cheap to call whether
     page is LRU or non-LRU movable once the page has been isolated because LRU
     pages can never have PAGE_MAPPING_MOVABLE set in page->mapping. It is also
     good for just peeking to test non-LRU movable pages before more expensive
     checking with lock_page() in pfn scanning to select a victim.

     For guaranteeing non-LRU movable page, VM provides PageMovable() function.
     Unlike __PageMovable(), PageMovable() validates page->mapping and
     mapping->a_ops->isolate_page under lock_page(). The lock_page() prevents
     sudden destroying of page->mapping.

     Drivers using __SetPageMovable() should clear the flag via
     __ClearMovablePage() under page_lock() before the releasing the page.

   * PG_isolated

     To prevent concurrent isolation among several CPUs, VM marks isolated page
     as PG_isolated under lock_page(). So if a CPU encounters PG_isolated
     non-LRU movable page, it can skip it. Driver doesn't need to manipulate the
     flag because VM will set/clear it automatically. Keep in mind that if the
     driver sees a PG_isolated page, it means the page has been isolated by the
     VM so it shouldn't touch the page.lru field.
     The PG_isolated flag is aliased with the PG_reclaim flag so drivers
     shouldn't use PG_isolated for its own purposes.
If a driver wants to make its pages movable, it should define a struct
movable_operations.  It then needs to call __SetPageMovable() on each
page that it may be able to move.  This uses the ``page->mapping`` field,
so this field is not available for the driver to use for other purposes.

Monitoring Migration
=====================
@@ -286,3 +191,5 @@ THP_MIGRATION_FAIL and PGMIGRATE_FAIL to increase.

Christoph Lameter, May 8, 2006.
Minchan Kim, Mar 28, 2016.

.. kernel-doc:: include/linux/migrate.h
+3 −57
@@ -19,9 +19,6 @@
#include <linux/stringify.h>
#include <linux/swap.h>
#include <linux/device.h>
#include <linux/mount.h>
#include <linux/pseudo_fs.h>
#include <linux/magic.h>
#include <linux/balloon_compaction.h>
#include <asm/firmware.h>
#include <asm/hvcall.h>
@@ -500,19 +497,6 @@ static struct notifier_block cmm_mem_nb = {
};

#ifdef CONFIG_BALLOON_COMPACTION
static struct vfsmount *balloon_mnt;

static int cmm_init_fs_context(struct fs_context *fc)
{
	return init_pseudo(fc, PPC_CMM_MAGIC) ? 0 : -ENOMEM;
}

static struct file_system_type balloon_fs = {
	.name = "ppc-cmm",
	.init_fs_context = cmm_init_fs_context,
	.kill_sb = kill_anon_super,
};

static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
			   struct page *newpage, struct page *page,
			   enum migrate_mode mode)
@@ -564,47 +548,13 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
	return MIGRATEPAGE_SUCCESS;
}

static int cmm_balloon_compaction_init(void)
static void cmm_balloon_compaction_init(void)
{
	int rc;

	balloon_devinfo_init(&b_dev_info);
	b_dev_info.migratepage = cmm_migratepage;

	balloon_mnt = kern_mount(&balloon_fs);
	if (IS_ERR(balloon_mnt)) {
		rc = PTR_ERR(balloon_mnt);
		balloon_mnt = NULL;
		return rc;
	}

	b_dev_info.inode = alloc_anon_inode(balloon_mnt->mnt_sb);
	if (IS_ERR(b_dev_info.inode)) {
		rc = PTR_ERR(b_dev_info.inode);
		b_dev_info.inode = NULL;
		kern_unmount(balloon_mnt);
		balloon_mnt = NULL;
		return rc;
	}

	b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
	return 0;
}
static void cmm_balloon_compaction_deinit(void)
{
	if (b_dev_info.inode)
		iput(b_dev_info.inode);
	b_dev_info.inode = NULL;
	kern_unmount(balloon_mnt);
	balloon_mnt = NULL;
}
#else /* CONFIG_BALLOON_COMPACTION */
static int cmm_balloon_compaction_init(void)
{
	return 0;
}

static void cmm_balloon_compaction_deinit(void)
static void cmm_balloon_compaction_init(void)
{
}
#endif /* CONFIG_BALLOON_COMPACTION */
@@ -622,9 +572,7 @@ static int cmm_init(void)
	if (!firmware_has_feature(FW_FEATURE_CMO) && !simulate)
		return -EOPNOTSUPP;

	rc = cmm_balloon_compaction_init();
	if (rc)
		return rc;
	cmm_balloon_compaction_init();

	rc = register_oom_notifier(&cmm_oom_nb);
	if (rc < 0)
@@ -658,7 +606,6 @@ static int cmm_init(void)
out_oom_notifier:
	unregister_oom_notifier(&cmm_oom_nb);
out_balloon_compaction:
	cmm_balloon_compaction_deinit();
	return rc;
}

@@ -677,7 +624,6 @@ static void cmm_exit(void)
	unregister_memory_notifier(&cmm_mem_nb);
	cmm_free_pages(atomic_long_read(&loaned_pages));
	cmm_unregister_sysfs(&cmm_dev);
	cmm_balloon_compaction_deinit();
}

/**
+3 −58
@@ -29,8 +29,6 @@
#include <linux/rwsem.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/mount.h>
#include <linux/pseudo_fs.h>
#include <linux/balloon_compaction.h>
#include <linux/vmw_vmci_defs.h>
#include <linux/vmw_vmci_api.h>
@@ -1730,20 +1728,6 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b)


#ifdef CONFIG_BALLOON_COMPACTION

static int vmballoon_init_fs_context(struct fs_context *fc)
{
	return init_pseudo(fc, BALLOON_VMW_MAGIC) ? 0 : -ENOMEM;
}

static struct file_system_type vmballoon_fs = {
	.name           	= "balloon-vmware",
	.init_fs_context	= vmballoon_init_fs_context,
	.kill_sb        	= kill_anon_super,
};

static struct vfsmount *vmballoon_mnt;

/**
 * vmballoon_migratepage() - migrates a balloon page.
 * @b_dev_info: balloon device information descriptor.
@@ -1862,21 +1846,6 @@ static int vmballoon_migratepage(struct balloon_dev_info *b_dev_info,
	return ret;
}

/**
 * vmballoon_compaction_deinit() - removes compaction related data.
 *
 * @b: pointer to the balloon.
 */
static void vmballoon_compaction_deinit(struct vmballoon *b)
{
	if (!IS_ERR(b->b_dev_info.inode))
		iput(b->b_dev_info.inode);

	b->b_dev_info.inode = NULL;
	kern_unmount(vmballoon_mnt);
	vmballoon_mnt = NULL;
}

/**
 * vmballoon_compaction_init() - initialized compaction for the balloon.
 *
@@ -1888,33 +1857,15 @@ static void vmballoon_compaction_deinit(struct vmballoon *b)
 *
 * Return: zero on success or error code on failure.
 */
static __init int vmballoon_compaction_init(struct vmballoon *b)
static __init void vmballoon_compaction_init(struct vmballoon *b)
{
	vmballoon_mnt = kern_mount(&vmballoon_fs);
	if (IS_ERR(vmballoon_mnt))
		return PTR_ERR(vmballoon_mnt);

	b->b_dev_info.migratepage = vmballoon_migratepage;
	b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb);

	if (IS_ERR(b->b_dev_info.inode))
		return PTR_ERR(b->b_dev_info.inode);

	b->b_dev_info.inode->i_mapping->a_ops = &balloon_aops;
	return 0;
}

#else /* CONFIG_BALLOON_COMPACTION */

static void vmballoon_compaction_deinit(struct vmballoon *b)
{
}

static int vmballoon_compaction_init(struct vmballoon *b)
static inline void vmballoon_compaction_init(struct vmballoon *b)
{
	return 0;
}

#endif /* CONFIG_BALLOON_COMPACTION */

static int __init vmballoon_init(void)
@@ -1939,9 +1890,7 @@ static int __init vmballoon_init(void)
	 * balloon_devinfo_init() .
	 */
	balloon_devinfo_init(&balloon.b_dev_info);
	error = vmballoon_compaction_init(&balloon);
	if (error)
		goto fail;
	vmballoon_compaction_init(&balloon);

	INIT_LIST_HEAD(&balloon.huge_pages);
	spin_lock_init(&balloon.comm_lock);
@@ -1958,7 +1907,6 @@ static int __init vmballoon_init(void)
	return 0;
fail:
	vmballoon_unregister_shrinker(&balloon);
	vmballoon_compaction_deinit(&balloon);
	return error;
}

@@ -1985,8 +1933,5 @@ static void __exit vmballoon_exit(void)
	 */
	vmballoon_send_start(&balloon, 0);
	vmballoon_pop(&balloon);

	/* Only once we popped the balloon, compaction can be deinit */
	vmballoon_compaction_deinit(&balloon);
}
module_exit(vmballoon_exit);