Deep Copy and Shallow Copy in Python

Deep Copy and Shallow Copy in Python

Description:
In Python programming, when we need to copy an object (especially a composite object, such as a list or dictionary that contains other objects), we encounter the concepts of shallow copy and deep copy. Understanding the difference between them is crucial to avoid potential errors in programs. Simply put, a shallow copy only duplicates the object itself, not the sub-objects referenced within it; a deep copy recursively duplicates the object and all its sub-objects.

Key Points Explanation:

  1. Assignment Operation
    First, it is important to clarify that in Python, variables can be thought of as labels or names pointing to an object. When we use the assignment operator (=), we are not creating a new object but merely creating a new label (binding) for the same object.

    • Example:
      list_a = [1, 2, [3, 4]]
      list_b = list_a # Assignment operation
      
    • Explanation: Both list_a and list_b point to the same list object [1, 2, [3, 4]] in memory. Think of them as two remote controls operating the same television.
    • Verification:
      print(list_a is list_b) # Output: True (The `is` operator checks if two variables point to the same object)
      
    • Consequence: Modifying this shared object through either variable will reflect the change in the other.
      list_a[0] = 99
      print(list_a) # Output: [99, 2, [3, 4]]
      print(list_b) # Output: [99, 2, [3, 4]] (list_b also changed)
      
  2. Shallow Copy
    A shallow copy creates a new object containing references to the items of the original object. This is fine for immutable elements (e.g., integers, strings, tuples) because immutable objects themselves cannot be modified. However, issues arise with mutable elements (e.g., sublists, subdictionaries).

    • Creation Methods: For lists, you can use slice notation list[:] or the list.copy() method. The copy.copy() function from the copy module is a generic method for shallow copying.
    • Example:
      import copy
      list_a = [1, 2, [3, 4]]
      list_c = list_a.copy() # Or list_c = list_a[:] or list_c = copy.copy(list_a)
      
    • Explanation: list_c is a newly created list object; it is no longer the same object as list_a.
      print(list_a is list_c) # Output: False
      
    • Impact on the First Level: Modifying a first-level element (i.e., non-subobject) of list_a does not affect list_c.
      list_a[0] = 99
      print(list_a) # Output: [99, 2, [3, 4]]
      print(list_c) # Output: [1, 2, [3, 4]] (The first level of list_c is unaffected)
      
    • Impact on Nested Mutable Objects: The third element inside list_a and list_c (the sublist [3, 4]) is still the same object! Because a shallow copy only copies references. Modifying this shared sublist affects both lists.
      list_a[2][0] = 88
      print(list_a) # Output: [99, 2, [88, 4]]
      print(list_c) # Output: [1, 2, [88, 4]] (The sublist inside list_c was also modified!)
      
    • Diagram: After a shallow copy, list_a and list_c are two different "shells" (list objects), but their internal references to the sublist point to the same sublist object in memory.
  3. Deep Copy
    A deep copy solves the problem of shallow copies. It recursively duplicates the original object and all sub-objects it contains, creating a completely independent clone with no shared references between the old and new objects.

    • Creation Method: Use the copy.deepcopy() function from the copy module.
    • Example:
      import copy
      list_a = [1, 2, [3, 4]]
      list_d = copy.deepcopy(list_a)
      
    • Explanation: list_d is a brand new, completely independent object. Not only is list_d itself new, but even the sublist [3, 4] inside it has been recreated as a separate copy.
      print(list_a is list_d) # Output: False (The outer lists are different)
      print(list_a[2] is list_d[2]) # Output: False (The inner sublists are also different!)
      
    • Verification of Independence: Now, no matter which level (outer or inner) of the original object you modify, it will not affect the deeply copied object.
      list_a[0] = 99
      list_a[2][0] = 88
      print(list_a) # Output: [99, 2, [88, 4]]
      print(list_d) # Output: [1, 2, [3, 4]] (list_d remains completely unaffected)
      

Summary and Choice:

  • Assignment: Creates a reference (alias) to the object, sharing the same data.
  • Shallow Copy: Creates a new shell object, but the internal sub-objects remain references. Suitable for simple object structures without nested mutable objects, or when you explicitly intend to share internal data.
  • Deep Copy: Creates a completely independent clone. Suitable for complex object structures containing nested mutable objects where you need a fully independent copy. Deep copies are generally more time- and memory-intensive than shallow copies.

Understanding the differences among these three helps you make the right choice when handling data structures and avoid unintended data modification errors.